Turkish NLP with BERTürk: Fine-tuning for Sentiment Analysis

TEKNOFEST National Finalist
93% Accuracy
HEZARTECH Project Overview: As team leader, I guided our high school team to become TEKNOFEST 2024 National Finalists by developing an advanced Turkish NLP model for target-based sentiment analysis. We fine-tuned BERTürk on a custom dataset of 37,000+ Turkish customer reviews, achieving ~93% accuracy and F1 score.

The Challenge: Turkish Language NLP

Natural language processing for Turkish is hard because the language is agglutinative and morphologically rich. Unlike English, a Turkish word can take a long chain of suffixes, so a single root produces a very large number of surface forms. Approaches that treat each word form as a separate vocabulary entry struggle with this.

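"Bu ürünü kesinlikle tavsiye etmiyorum çünkü kalitesi çok düşük ve müşteri hizmetleri berbat!"
(I definitely do not recommend this product because its quality is very low and the customer service is terrible!)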

Analysis needed: Product → Negative, Quality → Negative, Customer Service → Negative
37K+ Turkish reviews
93% F1 score
5 target categories
1st high school team in the finals

Why Target-Based Sentiment Analysis?

Traditional sentiment analysis assigns one overall sentiment to an entire text, but real-world applications need finer-grained insight. In e-commerce, a customer might love a product's design but hate its delivery speed. Target-based sentiment analysis identifies the specific aspects mentioned in a text and determines the sentiment for each one.
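To make the idea concrete, here is a minimal sketch of how a target-based annotation could be represented in Python. The class name, the sentiment label names, and the example review are all assumptions made for illustration; the post does not show the team's actual data format.

```python
from dataclasses import dataclass

# Assumed label set: the post mentions three sentiment classes but does not name them.
SENTIMENTS = ("negative", "neutral", "positive")


@dataclass(frozen=True)
class TargetSentiment:
    """One (target, sentiment) pair found in a review."""
    target: str
    sentiment: str

    def __post_init__(self):
        if self.sentiment not in SENTIMENTS:
            raise ValueError(f"unknown sentiment: {self.sentiment!r}")


def summarize(annotations):
    """Map each target to its sentiment label."""
    return {a.target: a.sentiment for a in annotations}


# Hypothetical review: "Its design is very nice but the shipment arrived very late."
review = "Tasarımı çok güzel ama kargo çok geç geldi."
annotations = [
    TargetSentiment("design", "positive"),
    TargetSentiment("delivery", "negative"),
]
print(summarize(annotations))
```

A single overall label would have to call this review mixed or neutral; the per-target form keeps both opinions.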

The HEZARTECH Approach

Our team identified five key targets that matter most in Turkish e-commerce:

Dataset Creation: The Foundation

Creating a high-quality Turkish sentiment analysis dataset was our first major challenge. We needed authentic, diverse Turkish text with accurate target-based annotations.

Data Collection Strategy

We collected reviews from multiple Turkish e-commerce platforms to ensure diversity:

# Data collection pipeline
# (scrape_reviews and preprocess_turkish are project helpers not shown in this post)
def collect_turkish_reviews():
    sources = [
        'hepsiburada.com',
        'trendyol.com',
        'amazon.com.tr',
        'gittigidiyor.com'
    ]
    total_reviews = 0
    for source in sources:
        reviews = scrape_reviews(source, limit=10000)
        clean_reviews = preprocess_turkish(reviews)
        total_reviews += len(clean_reviews)
    return total_reviews  # 37,000+ reviews

Annotation Process

The most time-consuming part was manual annotation. Our team of 5 members labeled each review, identifying targets and their corresponding sentiments. We developed strict annotation guidelines to ensure consistency.
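The post does not say how consistency between annotators was checked. One common way to quantify it, shown here purely as an illustration, is Cohen's kappa, which corrects the raw agreement between two annotators for the agreement expected by chance. The function name and the sample labels below are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each annotator labeled at random with their own label frequencies
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two annotators for the same six (review, target) pairs
annotator_1 = ["neg", "neg", "pos", "neu", "pos", "neg"]
annotator_2 = ["neg", "neg", "pos", "pos", "pos", "neu"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))  # 0.478
```

Values near 1 indicate strong agreement; values near 0 mean the annotators agree no more often than chance would predict, which usually signals that the guidelines need tightening.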

Annotation Interface & Guidelines

BERTürk: The Turkish BERT

BERTürk is a BERT (Bidirectional Encoder Representations from Transformers) model pre-trained on Turkish Wikipedia and other Turkish corpora. Because its vocabulary and weights are learned from Turkish text alone, it captures Turkish-specific nuances better than multilingual models.

Why BERTürk Over Other Models?

We evaluated several approaches before settling on BERTürk:

Multilingual BERT: 78%. Good general performance but lacks Turkish-specific understanding.

Turkish FastText: 82%. Fast but struggles with context and complex sentences.

BERTürk (our choice): 93%. Best contextual understanding and Turkish language support.

Fine-tuning Process

Fine-tuning BERTürk for our specific task required careful consideration of hyperparameters, training strategy, and evaluation metrics.

Model Architecture

We added a classification head on top of BERTürk that produces one three-way sentiment prediction for each of the five targets, since a single review can mention several targets with different sentiments.

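import torch
from transformers import AutoModel, AutoTokenizer


class TurkishSentimentClassifier(torch.nn.Module):
    """BERTürk encoder with one sentiment prediction per target."""

    def __init__(self, n_targets=5, n_classes=3):
        super().__init__()
        self.n_targets = n_targets
        self.n_classes = n_classes
        self.berturk = AutoModel.from_pretrained('dbmdz/bert-base-turkish-cased')
        self.dropout = torch.nn.Dropout(0.3)
        # Read the hidden size from the encoder config (768 for BERT-base)
        hidden_size = self.berturk.config.hidden_size
        self.classifier = torch.nn.Linear(hidden_size, n_targets * n_classes)

    def forward(self, input_ids, attention_mask):
        outputs = self.berturk(input_ids=input_ids, attention_mask=attention_mask)
        pooled_output = outputs.pooler_output
        output = self.dropout(pooled_output)
        # Reshape to (batch, n_targets, n_classes); 5 targets, 3 sentiments by default
        return self.classifier(output).view(-1, self.n_targets, self.n_classes)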
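The post does not state which loss function was used for training. For logits shaped (batch, targets, classes), as returned by the classifier above, one natural choice is a cross-entropy loss averaged over both examples and targets. The formula below is a sketch under that assumption, where N is the batch size, T = 5 the number of targets, C = 3 the number of sentiment classes, z the logits, and y the gold class index for each example and target.

```latex
\mathcal{L}
  = -\frac{1}{N\,T} \sum_{i=1}^{N} \sum_{t=1}^{T}
    \log \frac{\exp\!\left(z_{i,t,y_{i,t}}\right)}
              {\sum_{c=1}^{C} \exp\!\left(z_{i,t,c}\right)}
```

Under this formulation each target is treated as its own three-way classification problem that shares the BERTürk encoder with the other four.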

Training Configuration

After extensive hyperparameter tuning, we found the optimal configuration:

Handling Turkish Language Complexities

Turkish presented several challenges that required special handling:

1. Agglutination

Turkish words can take multiple suffixes, creating very long words:

"Mağazalarımızdakilerden" = "Mağaza-lar-ımız-da-ki-ler-den"
(From those that are in our stores)
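BERT-style models cope with such long words by splitting them into subword pieces. The sketch below is a toy greedy longest-match tokenizer in the spirit of WordPiece, run over a tiny hand-made vocabulary. The vocabulary is an assumption chosen so that the pieces line up with the morphemes above; the real BERTürk vocabulary is learned from data and may split this word differently.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first subword split; non-initial pieces carry a '##' prefix."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Try the longest remaining substring first, then shrink it from the right
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matches, so the whole word is unknown
        tokens.append(piece)
        start = end
    return tokens


# Hypothetical toy vocabulary, NOT the real BERTürk vocabulary
toy_vocab = {"mağaza", "##lar", "##ımız", "##da", "##ki", "##ler", "##den"}
tokens = wordpiece_tokenize("mağazalarımızdakilerden", toy_vocab)
print(tokens)
# ['mağaza', '##lar', '##ımız', '##da', '##ki', '##ler', '##den']
```

Because the suffix pieces are shared by many words, the model can represent word forms it never saw whole during pre-training.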

2. Informal Language & Slang

Online reviews often contain informal Turkish, abbreviations, and internet slang that required special preprocessing:

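import re


def preprocess_turkish_text(text):
    # Handle common Turkish internet abbreviations
    replacements = {
        'mq': 'mağaza',
        'ürn': 'ürün',
        'müq': 'mükemmel',
        'sb': 'süper',
        'krg': 'kargo'
    }
    for abbr, full in replacements.items():
        # Match whole words only, so letters inside longer words are left untouched
        text = re.sub(rf'\b{re.escape(abbr)}\b', full, text)
    # Normalize Turkish characters
    # (normalize_turkish_chars is a project helper not shown in this post)
    text = normalize_turkish_chars(text)
    return text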

3. Negation Handling

Turkish negation can significantly change meaning, requiring careful attention during training:

"Kalitesi iyi değil" (Quality is not good) ≠ "Kalitesi iyi" (Quality is good)

Results and Evaluation

Our final model achieved impressive results across all target categories:

Overall F1 score: 93.2%
Product Quality: 94.1%
Customer Service: 91.8%
Shipping: 92.5%
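For readers unfamiliar with the metric, the following sketch computes a per-class F1 score and its macro average in plain Python. The post does not say whether the reported scores are macro- or micro-averaged, so the averaging choice here is an assumption, and the labels are made-up sample data.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 score for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives means precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


# Made-up gold and predicted sentiments for one target across six reviews
y_true = ["pos", "pos", "neg", "neu", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "neu", "pos", "pos"]
score = macro_f1(y_true, y_pred, labels=["neg", "neu", "pos"])
print(round(score, 4))  # 0.7222
```

Macro averaging gives every class equal weight, which matters when positive reviews dominate the data.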

Error Analysis

We conducted thorough error analysis to understand model limitations:

TEKNOFEST Competition Experience

Competing at TEKNOFEST as a high school team against university-level competitors was both challenging and rewarding. Our presentation focused on the practical applications and technical innovations of our approach.

Judge Feedback: "Impressive work for a high school team. The focus on Turkish language specifics and practical e-commerce applications demonstrates mature understanding of both NLP challenges and business needs."

Key Success Factors

Real-World Applications

Our Turkish sentiment analysis model has several practical applications:

Technical Challenges & Solutions

1. Data Imbalance

Natural review distributions were heavily skewed toward positive sentiments. We addressed this through:
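The specific techniques the team used are not listed here. As an illustration only, one widely used option is to weight each class inversely to its frequency so that rare sentiments contribute more to the loss. The sketch below uses the formula n_samples / (n_classes * class_count); the function name and the label counts are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Hypothetical skewed sample: 6 positive, 3 negative, 1 neutral
sample_labels = ["positive"] * 6 + ["negative"] * 3 + ["neutral"]
weights = inverse_frequency_weights(sample_labels)
for label, weight in weights.items():
    print(label, round(weight, 2))
```

Weights like these can be passed to a weighted cross-entropy loss, so that an error on a rare class costs more than an error on a common one.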

2. Multi-label Classification

A single review could discuss multiple targets with different sentiments, requiring specialized output handling and evaluation metrics.
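As a sketch of the output handling, the function below turns one review's grid of scores (one row per target, one column per sentiment) into a label per target by taking the highest score in each row. The target names are taken from the results reported in this post, while the sentiment label names, their order, and the scores are assumptions.

```python
def decode_predictions(logits, targets, sentiments):
    """Pick the highest-scoring sentiment for each target of one review."""
    if len(logits) != len(targets):
        raise ValueError("expected one row of scores per target")
    decoded = {}
    for target, row in zip(targets, logits):
        if len(row) != len(sentiments):
            raise ValueError("expected one score per sentiment")
        best_index = max(range(len(row)), key=row.__getitem__)
        decoded[target] = sentiments[best_index]
    return decoded


targets = ["product_quality", "customer_service", "shipping"]
sentiments = ["negative", "neutral", "positive"]  # assumed label order
# Made-up scores for a single review
logits = [
    [2.1, 0.3, -1.0],
    [-0.5, 0.2, 1.7],
    [0.1, 0.9, 0.4],
]
decoded = decode_predictions(logits, targets, sentiments)
print(decoded)
```

Evaluation can then be run per target by comparing each decoded label with the corresponding gold annotation.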

3. Computational Resources

Training large transformer models required optimization:

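import torch


# Memory optimization techniques
def optimize_training(model):
    # Gradient accumulation for larger effective batch size:
    # the optimizer steps once every accumulation_steps batches
    accumulation_steps = 4
    # Mixed precision training
    scaler = torch.cuda.amp.GradScaler()
    # Gradient checkpointing (a Hugging Face model method; for a custom
    # wrapper module, call it on the wrapped encoder instead)
    model.gradient_checkpointing_enable()
    return model, scaler, accumulation_steps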

Future Improvements

Based on our TEKNOFEST experience and continued research, we identified several enhancement opportunities:

Lessons Learned

This project taught valuable lessons about NLP research, team leadership, and competition preparation:

Technical Insights

Team Leadership

Open Source Contribution

To give back to the Turkish NLP community, we've open-sourced several components:

Conclusion

Leading the HEZARTECH team to TEKNOFEST finals was an incredible experience that combined technical innovation with practical problem-solving. Our Turkish sentiment analysis model not only achieved competitive performance but also addressed real challenges in the Turkish e-commerce market.

The project demonstrated that with proper focus on language-specific challenges, domain expertise, and rigorous methodology, even high school students can contribute meaningful research to the NLP field. This experience strengthened my passion for AI research and confirmed my commitment to advancing Turkish language technologies.

Interested in Turkish NLP? I'm always excited to discuss Natural Language Processing, especially for Turkish and other low-resource languages. Feel free to reach out if you're working on similar problems or want to collaborate on Turkish language technologies.