LexiLearn Study Companion

Intelligent study tool that extracts unknown vocabulary and generates audio for learning

Planned EdTech

Problem Solved

Students reading in a second language run into unknown words and lose comprehension. Most digital reading tools offer little context-aware vocabulary support. LexiLearn extracts vocabulary automatically and helps students master new words.

Core Features

  • Upload learning materials (PDF, EPUB, TXT)
  • Automatic vocabulary extraction and term frequency analysis
  • Context-aware definitions with usage examples
  • Text-to-speech generation with adjustable speed
  • Spaced repetition review system for vocabulary retention
  • Personal vocabulary flashcards with progress tracking
  • Reading statistics and comprehension insights
  • Export study materials to Anki format
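
As an illustration of the Anki export feature, here is a minimal sketch in plain Java. It assumes the simplest format Anki can import: a plain-text file with one note per line and fields separated by tabs. The `Flashcard` record and `AnkiTsvExporter` class are hypothetical names, not part of the project's actual code.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical value type: one vocabulary card (front = term, back = definition).
record Flashcard(String term, String definition) {}

final class AnkiTsvExporter {
    private AnkiTsvExporter() {}

    // Tabs and line breaks would be read as field or note separators,
    // so each run of them is collapsed into a single space.
    static String clean(String field) {
        return field.replaceAll("[\\t\\r\\n]+", " ").trim();
    }

    // Builds the file content: one "term<TAB>definition" line per card.
    static String toTsv(List<Flashcard> cards) {
        return cards.stream()
                .map(c -> clean(c.term()) + "\t" + clean(c.definition()))
                .collect(Collectors.joining("\n"));
    }
}
```

Writing the returned string to a UTF-8 `.txt` file gives something Anki's text import can map to a two-field note type. Richer exports (tags, audio, `.apkg` packages) would need more than this sketch covers.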

Technical Architecture

┌────────────────────────────────────┐
│     Angular Frontend (SPA)         │
│   - Document Upload                │
│   - Reading Interface              │
│   - Vocabulary Dashboard           │
│   - Flashcard Reviewer             │
└──────────────┬─────────────────────┘
               │
      ┌────────┴────────┐
      │                 │
┌─────▼─────────────────▼───────┐
│   Spring Boot REST API        │
│   - Auth Service              │
│   - Document Parser           │
│   - NLP Engine (vocabulary)   │
│   - Review Service (SRS)      │
│   - TTS Service               │
└──────────────┬────────────────┘
      ┌────────┴─────────┬──────────────┐
      │                  │              │
┌─────▼─────┐   ┌────────▼──┐  ┌────────▼──┐
│PostgreSQL │   │ MongoDB   │  │Google TTS │
│(Progress) │   │(Document  │  │  API      │
│           │   │Content)   │  │           │
└───────────┘   └───────────┘  └───────────┘
        

Tech Stack

Frontend

  • Angular 21
  • TypeScript
  • RxJS
  • TailwindCSS
  • Web Audio API

Backend

  • Java 17
  • Spring Boot 3
  • OpenNLP
  • Apache PDFBox
  • Google Cloud TTS

Data & DevOps

  • PostgreSQL
  • MongoDB
  • Redis (for caching)
  • AWS Lambda (document processing)

Technical Challenges

Natural Language Processing at Scale

Integrated OpenNLP for tokenization and part-of-speech (POS) tagging to extract meaningful vocabulary. Built a Spring Batch pipeline that processes large PDF uploads asynchronously, so an upload does not block the user while the document is parsed.
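
The frequency-analysis step can be sketched in plain Java as follows. This is not the OpenNLP pipeline itself: a regular expression stands in for the tokenizer, a tiny hard-coded stop list stands in for POS-based filtering, and the names `VocabularyExtractor` and `termFrequencies` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts candidate vocabulary terms in already-extracted text.
final class VocabularyExtractor {
    // A word is a run of Unicode letters, optionally joined by inner apostrophes or hyphens.
    private static final Pattern WORD = Pattern.compile("\\p{L}+(?:['-]\\p{L}+)*");

    // Illustrative stop list only; a real one would be per-language and far larger.
    private static final Set<String> STOP_WORDS =
            Set.of("the", "a", "an", "and", "or", "of", "to", "in", "is", "it");

    private VocabularyExtractor() {}

    // Returns term -> count, most frequent first, ties broken alphabetically.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            String term = matcher.group().toLowerCase(Locale.ROOT);
            // Skip very short tokens and stop words; they are rarely worth studying.
            if (term.length() > 2 && !STOP_WORDS.contains(term)) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // LinkedHashMap preserves the sorted order for callers that iterate.
        Map<String, Integer> ranked = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : entries) {
            ranked.put(entry.getKey(), entry.getValue());
        }
        return ranked;
    }
}
```

For the input "The cat saw the ephemeral light. Ephemeral light is rare; the cat slept." the ranked map begins with cat, ephemeral, and light (two occurrences each), followed by rare, saw, and slept. A production version would presumably lemmatize terms and rank rarer words above common ones instead of using raw counts.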

Spaced Repetition System (SRS)

Implemented the SM-2 algorithm to calculate each card's next review interval from the user's recall performance. Used PostgreSQL scheduling to generate due review tasks automatically.
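
For reference, one review step of SM-2 can be sketched as below. It follows the commonly cited formulation: recall quality is graded 0 to 5, the first two successful intervals are 1 and 6 days, and the ease factor is floored at 1.3. Variants differ on details such as whether the ease factor changes after a failed review, and the names `CardState` and `Sm2` are hypothetical, so treat this as an illustration, not the project's exact implementation.

```java
// Hypothetical value type: the per-card state SM-2 carries between reviews.
record CardState(int repetitions, int intervalDays, double easeFactor) {
    // New cards start with no successful repetitions and the conventional ease of 2.5.
    static CardState fresh() {
        return new CardState(0, 0, 2.5);
    }
}

final class Sm2 {
    private static final double MIN_EASE = 1.3;

    private Sm2() {}

    // Applies one review graded from 0 (complete blackout) to 5 (perfect recall).
    static CardState review(CardState state, int quality) {
        if (quality < 0 || quality > 5) {
            throw new IllegalArgumentException("quality must be between 0 and 5");
        }
        int repetitions;
        int intervalDays;
        if (quality >= 3) {
            // Successful recall: 1 day, then 6 days, then previous interval times ease.
            if (state.repetitions() == 0) {
                intervalDays = 1;
            } else if (state.repetitions() == 1) {
                intervalDays = 6;
            } else {
                intervalDays = (int) Math.round(state.intervalDays() * state.easeFactor());
            }
            repetitions = state.repetitions() + 1;
        } else {
            // Failed recall: restart the repetition sequence from the beginning.
            repetitions = 0;
            intervalDays = 1;
        }
        // EF' = EF + (0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)), never below 1.3.
        int miss = 5 - quality;
        double ease = state.easeFactor() + (0.1 - miss * (0.08 + miss * 0.02));
        return new CardState(repetitions, intervalDays, Math.max(MIN_EASE, ease));
    }
}
```

Three consecutive perfect reviews of a fresh card give intervals of 1, 6, and 16 days. Adding `intervalDays` to the review date yields the due date that a scheduled job could use to create the next review task.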

Future Improvements

  • AI-powered reading difficulty recommendations
  • Collaborative learning groups with shared vocabulary lists
  • Integration with popular books/articles APIs
  • Mobile app for offline studying
  • Integration with language learning platforms (e.g. Duolingo, subject to API availability)
  • Grammar checker with explanations

Axel Diego

Junior Full-Stack Developer

© 2026 Axel Diego. All rights reserved.
