NLP Projects
Exploring text generation, text classification, and speaker identification.
Text Classification
1. Aggressive Language Identification using a Vocabulary Graph Convolutional Neural Network
I developed a model for detecting offensive language and hate speech in text using Graph Neural Networks (GNNs). I represented the vocabulary as a graph so that the model can capture relationships between words, which improves its contextual understanding. The solution is implemented in PyTorch, with supporting libraries for data processing and model training. The model achieved high accuracy in identifying harmful language.
Read more
- VGCN for aggressive language detection
paper_here
git_repo
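As a rough illustration of the vocabulary-graph idea in this project, the sketch below links words that co-occur more often than chance and weights each edge by normalized pointwise mutual information (NPMI). This is an assumption about how such a graph can be built, not a description of the project's actual pipeline. The function name `build_vocab_graph`, the per-document co-occurrence window, the threshold, and the toy sentences are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def build_vocab_graph(docs, threshold=0.0):
    """Return {(word_a, word_b): npmi} for pairs whose NPMI exceeds threshold.

    Assumption: each document is treated as one co-occurrence window.
    """
    n_docs = len(docs)
    word_count = Counter()
    pair_count = Counter()
    for doc in docs:
        words = sorted(set(doc.lower().split()))
        word_count.update(words)
        pair_count.update(combinations(words, 2))

    edges = {}
    for (a, b), n_ab in pair_count.items():
        p_a = word_count[a] / n_docs
        p_b = word_count[b] / n_docs
        p_ab = n_ab / n_docs
        if p_ab == 1.0:
            # NPMI is undefined when a pair appears in every document.
            continue
        pmi = math.log(p_ab / (p_a * p_b))
        npmi = pmi / -math.log(p_ab)
        if npmi > threshold:
            edges[(a, b)] = npmi
    return edges


# Toy sentences invented for illustration only.
docs = ["you are stupid", "stupid idiot", "you are kind", "kind person"]
edges = build_vocab_graph(docs)
for pair, weight in sorted(edges.items()):
    print(pair, round(weight, 3))
```

The resulting edge weights could serve as the adjacency matrix that a graph convolution layer operates on, with words that never co-occur above chance left unconnected.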
2. Sentiment Analysis for Spanish (Peru) using SVM
I built a sentiment classifier for Peruvian Spanish text using Support Vector Machines (SVM). I collected tweets from Twitter in real time and used them to train the model. The project showed that an SVM can classify this kind of text accurately.
Read more
- Analisis masivo de datos en twitter para identificacion de opinion (Large-scale analysis of Twitter data for opinion identification)
thesis_online_repor
git_repo
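To make the SVM approach in this project more concrete, here is a minimal sketch of a linear SVM trained on bag-of-words counts with subgradient descent on the regularized hinge loss. It is a hand-written stand-in for a library SVM, with no bias term, and says nothing about the kernel, features, or tooling the project actually used. The four example sentences are invented, and the helper names `featurize`, `train_linear_svm`, and `predict` are hypothetical.

```python
from collections import Counter


def featurize(text, vocab):
    """Bag-of-words counts over a fixed vocabulary; unknown words are ignored."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Subgradient descent on the L2-regularized hinge loss. Labels are +1 or -1."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            # The regularization term shrinks the weights on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1:
                # The example is inside the margin: move w toward classifying it.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
    return w


def predict(text, vocab, w):
    score = sum(wj * xj for wj, xj in zip(w, featurize(text, vocab)))
    return 1 if score > 0 else -1


# Invented examples for illustration only (+1 positive, -1 negative).
train_texts = [
    "me encanta este lugar",
    "excelente servicio muy bueno",
    "odio este servicio",
    "muy malo terrible lugar",
]
labels = [1, 1, -1, -1]

vocab = sorted({word for text in train_texts for word in text.split()})
X = [featurize(text, vocab) for text in train_texts]
w = train_linear_svm(X, labels)

print(predict("me encanta", vocab, w))
print(predict("odio terrible", vocab, w))
```

Words that appear in both classes, such as "este" or "servicio", end up with small weights, while words seen only in one class carry most of the decision.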
Text Generation
1. Quechua Indigenous Lyrics Generation using LSTM
I developed a text generation model for the Quechua language using LSTM networks. I collaborated with a team to create the dataset, then trained the model to generate coherent song lyrics. The project aims to promote and preserve Quechua by making it more accessible in digital formats and supporting its cultural heritage.
Read more
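- Generación automática de letras de canciones usando redes neuronales recurrentes para el quechua (Automatic generation of song lyrics for Quechua using recurrent neural networks). Document available in the online repository:
thesis_online_repor
- You can check out the
git_repo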
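For readers unfamiliar with LSTMs, the sketch below shows the forward pass of a single LSTM cell over a short character sequence, which is the building block a character-level lyrics generator would repeat at every time step. It is a plain-Python illustration under stated assumptions, not the project's model: the weights are random and untrained, the function `lstm_step` and the gate keys `"i"`, `"f"`, `"o"`, `"g"` are hypothetical names, and the input string is a placeholder rather than a sample from the dataset.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.

    W[gate] has shape (hidden, input + hidden); b[gate] has length hidden.
    Gates: i = input, f = forget, o = output, g = candidate cell values.
    """
    z = x + h_prev  # list concatenation of the input and the previous hidden state

    def affine(gate):
        return [sum(wij * zj for wij, zj in zip(row, z)) + bi
                for row, bi in zip(W[gate], b[gate])]

    i = [sigmoid(v) for v in affine("i")]
    f = [sigmoid(v) for v in affine("f")]
    o = [sigmoid(v) for v in affine("o")]
    g = [math.tanh(v) for v in affine("g")]
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


# Placeholder input, one-hot encoded per character (not taken from the dataset).
text = "inti"
alphabet = sorted(set(text))
hidden_size = 4

rng = random.Random(0)  # fixed seed so the untrained weights are reproducible
W = {gate: [[rng.uniform(-0.5, 0.5) for _ in range(len(alphabet) + hidden_size)]
            for _ in range(hidden_size)]
     for gate in "ifog"}
b = {gate: [0.0] * hidden_size for gate in "ifog"}

h = [0.0] * hidden_size
c = [0.0] * hidden_size
for ch in text:
    x = [1.0 if ch == symbol else 0.0 for symbol in alphabet]
    h, c = lstm_step(x, h, c, W, b)

print(h)
```

In a full generator, the hidden state `h` would feed a softmax layer over the alphabet, and the sampled character would become the next input.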
Speech Recognition
1. A Comparative Study of Speaker Identification Models
This project compares three machine learning models for speaker identification: Gaussian Mixture Model (GMM), Hidden Markov Model (HMM), and Support Vector Machine (SVM). All models used the same Mel-frequency cepstral coefficient (MFCC) features. HMM and GMM each achieved about 94% accuracy on 100 audio recordings. SVM reached around 90% accuracy when classifying 2 to 5 speakers but performed poorly with more classes.
Read more
- You can check out the
git_repo_here
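The generative side of this comparison can be sketched as follows: fit one model per speaker on that speaker's feature frames, then assign a new recording to the speaker whose model gives the highest total log-likelihood. For brevity the sketch uses a single diagonal-covariance Gaussian per speaker, which is the one-component special case of a GMM, and hand-made 2-D vectors in place of real MFCC frames. None of this reproduces the project's code or results; the function names and the data are hypothetical.

```python
import math


def fit_diag_gaussian(frames, var_floor=1e-3):
    """Estimate the per-dimension mean and variance from a list of feature frames."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, var_floor)
           for d in range(dim)]
    return mean, var


def log_likelihood(frames, model):
    """Sum of diagonal-Gaussian log densities over all frames."""
    mean, var = model
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def identify(frames, models):
    """Return the speaker whose model scores the frames highest."""
    return max(models, key=lambda name: log_likelihood(frames, models[name]))


# Hand-made stand-ins for MFCC frames (real MFCCs usually have 12 or more dimensions).
training = {
    "speaker_a": [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]],
    "speaker_b": [[4.0, -1.0], [4.2, -0.8], [3.8, -1.2]],
}
models = {name: fit_diag_gaussian(frames) for name, frames in training.items()}

print(identify([[1.1, 2.1], [0.9, 1.9]], models))
print(identify([[4.1, -1.1]], models))
```

Because each speaker gets an independent model, adding a new speaker only requires fitting one more model, whereas a discriminative classifier such as an SVM must be retrained over all classes.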