How I Built CardGuard: AI-Powered Phishing Detection

Project Overview: CardGuard is an AI-powered browser extension that I developed during high school. It earned 2nd place at the TÜBİTAK National Research Projects Competition among 2000+ projects. It combines machine learning with blockchain-based data integrity verification to protect users from phishing attacks with 96%+ accuracy.

The Problem: Rising Phishing Threats

Phishing attacks have become one of the most prevalent cybersecurity threats, with millions of users falling victim each year. Traditional blacklist-based solutions are reactive: a site must be reported and listed before it can be blocked, so the lists are often outdated by the time they're deployed. I realized we needed a proactive, AI-driven approach that could identify malicious sites in real time.

Detection Accuracy: 96%+
Training Dataset: 10K+ websites
TÜBİTAK National: 2nd place
Competing Projects: 2000+

Architecture Overview

CardGuard's architecture is built on three main pillars: AI Detection Engine, Secure Data Handling, and User Privacy Protection. The system runs in the background and analyzes websites in real time without compromising user experience.

Figure: CardGuard system architecture diagram

Core Components

Gemini AI: Advanced language model for URL and content analysis
k-NN Algorithm: Pattern recognition for website classification
AES-256 Encryption: Encryption of sensitive data before transmission or storage
Blockchain Ledger: Decentralized data integrity verification

AI Detection Engine

The heart of CardGuard lies in its hybrid AI approach. I combined Google's Gemini AI with a custom k-Nearest Neighbors (k-NN) algorithm to create a robust detection system that analyzes multiple website characteristics simultaneously.

Feature Extraction Process

The system extracts over 50 different features from each website, including:

URL Analysis: Domain age, SSL certificate validity, suspicious subdomains
Content Analysis: Text patterns, form elements, external links
Visual Elements: Logo detection, color schemes, layout patterns
Behavioral Patterns: Redirect chains, JavaScript execution patterns

            
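# Simplified feature extraction example.
# The helpers called below (get_domain_age, check_ssl_certificate, and so on)
# stand in for the real extractors and are not shown here.
def extract_features(url, page_content):
    features = {
        'domain_age': get_domain_age(url),
        'ssl_valid': check_ssl_certificate(url),
        'suspicious_keywords': count_phishing_keywords(page_content),
        'form_count': count_forms(page_content),
        'external_links': count_external_links(page_content),
        'gemini_score': gemini_analysis(url, page_content),
    }
    # Convert the named features into the numeric vector used by the classifier
    return feature_vector(features)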
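
To make the URL Analysis category more concrete, here is a minimal sketch of a few lexical features that can be computed from the URL string alone, using only Python's standard library. The function name `url_features`, the keyword list, and the exact features are assumptions for illustration; they are not CardGuard's actual extractors.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical keyword list for illustration; not CardGuard's actual list.
SUSPICIOUS_WORDS = ("login", "verify", "secure", "account", "update")


def url_features(url: str) -> dict:
    """Compute a few lexical features from a URL string alone."""
    parsed = urlparse(url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is a common phishing signal.
    try:
        ipaddress.ip_address(host)
        is_ip = True
    except ValueError:
        is_ip = False

    labels = host.split(".") if host and not is_ip else []
    return {
        "uses_https": parsed.scheme == "https",
        "host_is_ip": is_ip,
        # Labels beyond "domain.tld"; a rough count that ignores multi-part TLDs.
        "subdomain_count": max(len(labels) - 2, 0),
        "url_length": len(url),
        "keyword_hits": sum(word in url.lower() for word in SUSPICIOUS_WORDS),
    }
```

For example, `url_features("http://paypal.com.secure-login.example.net/verify")` reports three subdomain labels and three keyword hits. Features such as domain age or certificate validity need network lookups and are left out of this sketch.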
            
        

Hybrid Classification Model

The two models complement each other: Gemini analyzes the semantic content and context of a page, while k-NN identifies patterns similar to known phishing sites in the training dataset.
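
One way such a combination could work is sketched below. This is an assumption for illustration, not the exact method used in CardGuard: a plain k-NN vote over feature vectors is blended with a language-model risk score through a weighted average. The names `knn_phishing_fraction`, `hybrid_score`, and `llm_weight` are hypothetical, and `llm_score` simply stands in for whatever score Gemini returns; no Gemini API is called here.

```python
import math
from collections import Counter


def knn_phishing_fraction(sample, training_set, k=3):
    """Fraction of the k nearest training points labeled phishing (label 1)."""
    nearest = sorted(training_set, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes[1] / len(nearest)


def hybrid_score(knn_fraction, llm_score, llm_weight=0.5):
    """Weighted average of the k-NN vote and a language-model risk score in [0, 1]."""
    return (1 - llm_weight) * knn_fraction + llm_weight * llm_score


# Toy 2-D feature vectors with labels (1 = phishing, 0 = legitimate).
training = [((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.1, 0.2), 0), ((0.2, 0.1), 0)]
fraction = knn_phishing_fraction((0.85, 0.85), training, k=3)
score = hybrid_score(fraction, llm_score=0.9)
```

In this toy run, two of the three nearest neighbors are phishing examples, so the k-NN fraction is 2/3, and the blended score rises because the assumed language-model score is high. The weight would need tuning on validation data.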

Innovation Highlight: The hybrid approach allowed me to achieve 96%+ accuracy while maintaining a low false positive rate. Gemini handles novel attack patterns that haven't been seen before, while k-NN provides reliable detection for known patterns.

Security Implementation

Security wasn't an afterthought—it was fundamental to CardGuard's design. I implemented multiple layers of protection to ensure user data remains private and secure.

AES-256 Encryption

All sensitive data is encrypted using AES-256 before transmission or storage. This includes user browsing patterns, analysis results, and any personal information that might be inadvertently collected.

Blockchain Data Integrity

To protect the integrity of the phishing database and detect tampering, I implemented a blockchain-based verification system. Each database update is hashed and recorded in a decentralized ledger, so any later modification of the training data produces a hash mismatch that can be detected.

            
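# Blockchain verification example
import hashlib
import json
import time

class BlockchainVerifier:
    def __init__(self):
        # Placeholder genesis block so get_latest_block() always returns a block
        self.blockchain = [{'timestamp': 0, 'database_hash': '',
                            'previous_hash': '', 'hash': '0' * 64}]

    def get_latest_block(self):
        return self.blockchain[-1]

    def verify_database_integrity(self, database_hash):
        # The database is intact if its hash matches the latest recorded hash
        return database_hash == self.get_latest_block()['database_hash']

    def add_update_block(self, update_data: bytes):
        new_block = {
            'timestamp': time.time(),
            'database_hash': hashlib.sha256(update_data).hexdigest(),
            'previous_hash': self.get_latest_block()['hash'],
        }
        # Store the block's own hash so the next block can link to it
        new_block['hash'] = hashlib.sha256(
            json.dumps(new_block, sort_keys=True).encode()).hexdigest()
        self.blockchain.append(new_block)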
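
Comparing against the latest block only confirms the current database hash. Tamper resistance comes from the links between blocks, so a verifier also needs to walk the whole chain. The sketch below shows one way to do that with the standard library. The helper names (`block_hash`, `make_block`, `chain_is_valid`) and the choice to hash a JSON serialization of each block are assumptions for illustration, not CardGuard's actual ledger format.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """SHA-256 over the block's fields, excluding its own 'hash' entry."""
    payload = {key: value for key, value in block.items() if key != "hash"}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


def make_block(database_hash: str, previous_hash: str) -> dict:
    """Build a block and stamp it with its own hash."""
    block = {"database_hash": database_hash, "previous_hash": previous_hash}
    block["hash"] = block_hash(block)
    return block


def chain_is_valid(chain: list) -> bool:
    """Check each block's own hash and its link to the previous block."""
    for index, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        if index > 0 and block["previous_hash"] != chain[index - 1]["hash"]:
            return False
    return True


genesis = make_block(database_hash="", previous_hash="")
update = make_block(hashlib.sha256(b"db v1").hexdigest(), genesis["hash"])
chain = [genesis, update]
```

Here `chain_is_valid(chain)` returns `True`. Editing any field of a stored block changes its recomputed hash, which breaks either that block's own check or the next block's `previous_hash` link.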
            
        

Dataset Creation and Training

Creating a comprehensive dataset was one of the most challenging aspects of the project. I collected over 10,000 websites, carefully labeled and categorized them, and ensured balanced representation across different types of phishing attacks.

Data Sources

PhishTank API: Verified phishing URLs from the community
Legitimate Sites: Top 1000 websites from Alexa rankings
Corporate Partners: Real-world examples from cybersecurity firms
Honeypots: Custom-deployed honeypots to catch new attacks

Model Training Process

The training process involved multiple iterations and careful hyperparameter tuning. I used cross-validation to ensure the model generalizes well to unseen data and implemented techniques to handle class imbalance.
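
As an illustration of how cross-validation and class imbalance interact, here is a minimal sketch of stratified k-fold splitting in plain Python. Stratification is one common way to keep the phishing-to-legitimate ratio stable across folds; the source does not say which techniques were actually used, so treat this as an assumption. The function names are hypothetical.

```python
from collections import defaultdict


def stratified_folds(labels, n_folds=5):
    """Assign sample indices to folds so each fold keeps the class ratio."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        # Deal each class's samples round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


def train_test_splits(labels, n_folds=5):
    """Yield (train_indices, test_indices) pairs, one per fold."""
    folds = stratified_folds(labels, n_folds)
    for held_out in range(n_folds):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, folds[held_out]


# Toy imbalanced labels: 8 legitimate (0) and 4 phishing (1).
labels = [0] * 8 + [1] * 4
splits = list(train_test_splits(labels, n_folds=4))
```

With these toy labels, each of the four test folds holds two legitimate samples and one phishing sample, matching the overall 2:1 ratio. In practice the indices would be shuffled within each class before dealing them out.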

Figure: Model training performance metrics

Browser Extension Development

The browser extension serves as the user-facing component of CardGuard. Built using modern web technologies, it provides real-time protection without impacting browsing performance.

Key Features

Real-time Analysis: Every page is analyzed as it loads
Visual Warnings: Clear, non-intrusive alerts for suspicious sites
User Education: Detailed explanations of why a site is flagged
Whitelist Management: Users can manage trusted sites
Privacy Controls: Granular control over data sharing

            
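// Extension background script example.
// analyzeURL, showWarning, and THRESHOLD are defined elsewhere in the extension.
chrome.tabs.onUpdated.addListener((tabId, changeInfo, tab) => {
    // Run once the page has finished loading, and only for regular web pages
    if (changeInfo.status === 'complete' && tab.url && tab.url.startsWith('http')) {
        analyzeURL(tab.url)
            .then(result => {
                if (result.risk_level > THRESHOLD) {
                    showWarning(tabId, result);
                }
            })
            .catch(error => console.error('CardGuard analysis failed:', error));
    }
});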
            
        

Results and Impact

The results exceeded my expectations. CardGuard achieved 96%+ accuracy in detecting phishing sites while maintaining a false positive rate below 2%. The project's success led to recognition at the national level and opened doors for further research in cybersecurity.
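
For readers unfamiliar with these two metrics, the sketch below shows how accuracy and false positive rate are computed from predictions. The labels are made up for illustration and are not CardGuard's evaluation data; the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives (1 = phishing, 0 = legitimate)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Share of all sites that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(tp, tn, fp, fn):
    """Share of legitimate sites that were wrongly flagged as phishing."""
    return fp / (fp + tn)


# Made-up labels for illustration only; these are not CardGuard's results.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
```

On this toy data, `accuracy(*counts)` is 0.75 and `false_positive_rate(*counts)` is 0.2. The two numbers measure different things: a detector can be highly accurate overall and still annoy users if it flags too many legitimate sites.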

Competition Performance

At the TÜBİTAK National Research Projects Competition, CardGuard stood out among 2000+ projects from across Turkey. The judges were particularly impressed by the innovative use of AI and the practical applicability of the solution.

Judge Feedback: "CardGuard represents a significant advancement in automated phishing detection. The combination of AI technologies with practical security measures makes this a valuable contribution to cybersecurity research."

Lessons Learned

Building CardGuard taught me invaluable lessons about AI development, cybersecurity, and project management. Here are the key takeaways:

Data Quality Matters: The success of any AI project depends heavily on the quality of training data
Security by Design: Security considerations must be integrated from the beginning, not added later
User Experience: Even the most sophisticated technology fails if users find it difficult to use
Continuous Learning: Cyber threats evolve rapidly, requiring adaptive and continuously learning systems

Future Enhancements

CardGuard is just the beginning. I'm already working on several enhancements that will make the system even more effective:

Multi-language Support: Extending detection to non-English phishing sites
Mobile Protection: Developing mobile app versions for iOS and Android
Enterprise Integration: Creating enterprise-grade solutions for organizations
Federated Learning: Implementing privacy-preserving collaborative learning

Open Source Contribution

I believe in the power of open source to advance cybersecurity research. While the core CardGuard system remains proprietary due to security considerations, I've open-sourced several components that the community can benefit from:

Feature extraction libraries
Dataset preprocessing tools
Evaluation metrics and benchmarks
Educational materials and tutorials

Conclusion

Developing CardGuard was an incredible journey that combined my passion for AI, cybersecurity, and helping people stay safe online. The project's success at TÜBİTAK validated my approach and motivated me to continue pushing the boundaries of what's possible in cybersecurity research.

For aspiring developers and researchers, my advice is simple: start with a real problem, apply cutting-edge technology thoughtfully, and never compromise on security or user privacy. The future of cybersecurity depends on innovative solutions like CardGuard, and I'm excited to continue contributing to this vital field.

Want to learn more? Feel free to reach out if you have questions about CardGuard, AI in cybersecurity, or if you're interested in collaboration opportunities. I'm always excited to discuss technology and share knowledge with fellow developers and researchers.