ADTA 5760 Group 4 | NLP with Neural Networks

Overview

About the Project

Our group project for ADTA 5760 explores the intersection of Natural Language Processing and Artificial Neural Networks, focusing on real-world applications in two critical domains.

📦

Supply Chain Management

Applying NLP techniques to extract insights from supply chain documents, including procurement reports, logistics data, and operational documentation. Our analysis reveals patterns in supplier communications, demand forecasting narratives, and risk assessment language.

🏥

Medical & Healthcare

Leveraging neural network-based NLP to process medical documents, clinical reports, and healthcare literature. Our research focuses on extracting meaningful information from unstructured medical text, identifying key entities, and understanding document sentiment and themes.

🎯

Project Objectives

Collect and preprocess domain-specific PDF documents for NLP analysis
Apply text classification, named entity recognition, and sentiment analysis techniques
Implement neural network architectures (RNN, LSTM, Transformers) for text processing
Compare NLP model performance across supply chain and medical domains
Visualize findings and present actionable insights from unstructured text data

Research

Our Research Areas

Deep-diving into two high-impact domains where NLP can transform how organizations process and understand large volumes of text data.

Supply Chain NLP Analysis

The supply chain domain generates vast amounts of unstructured text — from procurement emails and contracts to logistics reports and supplier evaluations. Our research applies NLP to make this data actionable.

Documents: 50+ PDFs
Focus Areas: Procurement, Logistics, Risk
Techniques: NER, Classification, Sentiment

Key Findings

Automated extraction of supplier names, contract terms, and delivery timelines
Sentiment patterns in supplier communications correlate with delivery performance
Topic modeling reveals recurring themes in supply chain risk documentation
Text classification reaches 85-92% accuracy in categorizing procurement, logistics, risk, and contract documents

Document Classification Accuracy

Procurement: 92%
Logistics: 88%
Risk Reports: 85%
Contracts: 90%

Medical Document NLP Analysis

Healthcare organizations deal with massive amounts of clinical text, research papers, and patient documentation. NLP can help extract structured insights from this unstructured data, improving decision-making and research efficiency.

Documents: 50+ PDFs
Focus Areas: Clinical, Research, Reports
Techniques: BioNER, Classification, QA

Key Findings

Biomedical NER identifies diseases, drugs, and treatment entities, with F1-scores between 84% and 91% depending on entity type
Document classification distinguishes clinical notes from research papers
Transformer-based models outperform traditional ML on medical text tasks
Domain-specific pre-training significantly improves NLP model accuracy

Entity Recognition F1-Scores

Diseases: 89%
Medications: 91%
Procedures: 84%
Symptoms: 86%

Approach

Methodology

Our end-to-end NLP pipeline from raw PDF documents to actionable insights.

01. Data Collection

Gathering 100+ PDF documents across supply chain and medical domains from academic papers, industry reports, and public datasets.

02. Preprocessing

PDF text extraction, cleaning, tokenization, stop-word removal, lemmatization, and domain-specific preprocessing.
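A minimal sketch of the cleaning, tokenization, stop-word removal, and lemmatization steps, using only the Python standard library. It assumes the PDF text has already been extracted to a string. The stop-word set and lemma table are tiny hypothetical placeholders; a real pipeline would more likely take both from NLTK or spaCy, and this is not the project's actual code.

```python
import re

# Hypothetical toy resources standing in for a library stop-word list
# and lemmatizer (e.g. from NLTK or spaCy).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are",
              "was", "were", "for", "on", "with"}
LEMMAS = {"suppliers": "supplier", "delivered": "deliver",
          "shipments": "shipment", "delays": "delay",
          "patients": "patient", "reported": "report"}


def clean(text):
    """Rejoin words hyphenated across line breaks, collapse whitespace, lowercase."""
    text = re.sub(r"-\s*\n\s*", "", text)  # "ship-\nments" -> "shipments"
    text = re.sub(r"\s+", " ", text)
    return text.lower().strip()


def tokenize(text):
    """Split into alphabetic tokens; digits and punctuation are dropped."""
    return re.findall(r"[a-z]+", text)


def preprocess(raw):
    """Run the full chain: clean -> tokenize -> drop stop words -> lemmatize."""
    tokens = tokenize(clean(raw))
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [LEMMAS.get(t, t) for t in tokens]


print(preprocess("The suppliers delivered ship-\nments with delays."))
```

Dropping digits is a simplification that may be wrong for medical text, where dosages and lab values carry meaning; domain-specific preprocessing would likely keep them.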

03. Feature Engineering

TF-IDF vectorization, word embeddings (Word2Vec, GloVe), and contextual embeddings using pre-trained transformers.
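To make the TF-IDF step concrete, here is a small from-scratch sketch over already-tokenized documents. It uses one common smoothed IDF formula; library implementations such as scikit-learn's differ in details like normalization, so the numbers are illustrative only. The function name `tfidf` and the sample tokens are assumptions for the example. Word2Vec, GloVe, and transformer embeddings are not shown.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    tf  = term count / document length
    idf = ln((1 + N) / (1 + df)) + 1   (a smoothed variant)
    """
    n_docs = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


docs = [["supplier", "delay", "delay"], ["supplier", "invoice"]]
for vector in tfidf(docs):
    print(vector)
```

In the first document, "delay" gets a higher weight than "supplier" because it occurs more often there and appears in fewer documents overall.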

04. Model Training

Training RNN, LSTM, and Transformer-based models for text classification, NER, and sentiment analysis tasks.
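For readers less familiar with what an LSTM layer computes, the sketch below shows the forward pass of a single LSTM time step in plain Python. It is illustrative only: it does no training, and the `params` layout and function names are assumptions made for the example rather than any framework's API. Deep learning frameworks handle these computations, along with backpropagation, internally.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, U, b, x, h):
    """Compute W @ x + U @ h + b for list-based matrices and vectors."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + b_i
        for W_row, U_row, b_i in zip(W, U, b)
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step.

    params maps each gate name ("f", "i", "o", "g") to a (W, U, b) tuple:
    input weights, recurrent weights, and bias (hypothetical layout).
    """
    f = [sigmoid(z) for z in affine(*params["f"], x, h_prev)]    # forget gate
    i = [sigmoid(z) for z in affine(*params["i"], x, h_prev)]    # input gate
    o = [sigmoid(z) for z in affine(*params["o"], x, h_prev)]    # output gate
    g = [math.tanh(z) for z in affine(*params["g"], x, h_prev)]  # candidate cell state
    # New cell state: keep part of the old state, add part of the candidate.
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # New hidden state: gated, squashed view of the cell state.
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


# Toy setup: input size 3, hidden size 2, all weights and biases zero.
zero = ([[0.0] * 3 for _ in range(2)], [[0.0] * 2 for _ in range(2)], [0.0] * 2)
params = {gate: zero for gate in "fiog"}
h, c = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, -1.0], params)
print(h, c)
```

With all-zero weights every gate evaluates to 0.5 and the candidate to 0, so the new cell state is simply half the previous one. The cell state carried from step to step is what lets an LSTM retain information over longer spans than a plain RNN.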

05. Evaluation & Visualization

Model evaluation using accuracy, F1-score, precision, and recall. Results visualized through interactive charts and dashboards.
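The four metrics named above can be computed directly from true and predicted labels. The sketch below does so per class in plain Python; the function name `classification_metrics` and the sample labels are made up for illustration and are not project data. In practice scikit-learn's metrics module would normally be used instead.

```python
def classification_metrics(y_true, y_pred):
    """Return overall accuracy plus per-label precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    labels = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    per_label = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_label[label] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, per_label


# Made-up labels, for illustration only.
y_true = ["procurement", "procurement", "logistics", "risk"]
y_pred = ["procurement", "logistics", "logistics", "risk"]
accuracy, per_label = classification_metrics(y_true, y_pred)
print(accuracy)
print(per_label)
```

Reporting precision and recall per class alongside accuracy matters when classes are imbalanced, since a model can score high accuracy while missing most examples of a rare class.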

Technology Stack

Python: Core Language
TensorFlow: Deep Learning
Keras: Neural Networks
spaCy: NLP Pipeline
NLTK: Text Processing
Hugging Face: Transformers
scikit-learn: ML Models
GCP: Cloud Platform

Outcomes

Results & Insights

Key findings from our NLP analysis across both research domains.

Classification Accuracy

Average accuracy across document classification tasks using fine-tuned neural network models.

NER F1-Score

Named entity recognition performance on domain-specific entities in both supply chain and medical texts.

Faster Processing

Speed improvement over manual document review when using our automated NLP pipeline.

Documents Analyzed

Total PDF documents processed and analyzed across both supply chain and medical domains.

Key Insights

1. Domain Adaptation Matters

Models pre-trained on general text and fine-tuned on domain-specific data consistently outperform generic models by 8-12% on classification tasks.

2. Medical Text Is Harder

Medical documents present unique challenges: specialized vocabulary, abbreviations, and complex sentence structures all call for domain-specific preprocessing.

3. Transformers Win

Transformer-based architectures (BERT, BioBERT) consistently outperform RNN/LSTM baselines on all text classification and NER tasks.

4. Practical Value

Automated NLP pipelines can significantly reduce manual effort in processing supply chain and medical documents, enabling faster decision-making.

Team

Meet Group 4

Graduate students in the Advanced Data Analytics program at the University of North Texas.

Karan Parekh

Team Lead & NLP Engineer

Focused on supply chain document analysis, model architecture design, and project coordination.

Sanjana PR

Data Analyst

Specializing in medical document preprocessing, data cleaning, and feature engineering for NLP models.

Sana Mhapsekar

ML Engineer

Working on neural network model training, hyperparameter tuning, and evaluation metrics.

Medina Maloku

Research Analyst

Leading supply chain research and the literature review, along with result visualization and documentation.

Natural Language Processing with Artificial Neural Networks
