Biography

I am an experienced NLP professional with 2 years of experience in LLMs and Generative AI and 4 years of Conversational AI research. I have industry experience through internships at leading companies, where I deployed several NLP models for real-world user problems. My research focuses on the optimization of AI chatbots.
You can ask ShihabAI v1.0 a question at the bottom right. He only knows one answer though (hehe).

Work Experience

Amazon.com

06/2023 - 09/2023: Applied Scientist Intern

  • Created a Generative AI-based pipeline to extract and rewrite millions of user-required features from queries and product descriptions to improve shopper experience
  • Fine-tuned multiple Large Language Models (LLMs) such as Falcon 40B for entity extraction and title rewriting tasks with memory optimizations using DeepSpeed, PEFT, and LoRA
  • Performed big data analysis with PySpark on EMR clusters on 136M customer query and product information data to provide insights on shopper behavior

Brain Technologies Inc.

06/2022 - 09/2022: NLP Intern

  • Created a multi-turn conversational AI in Discord to recommend and re-rank products based on user purchase data to increase user engagement
  • Collaborated with full-time engineers on 4+ projects and 1 hackathon to solve various problems related to natural language processing
  • Used and fine-tuned the GPT-3 large language model (LLM) with few-shot learning to create prototypes for named entity recognition, intent detection, and text summarization for the Natural app

University of California, Riverside

07/2020 - Present: Research Assistant

Latest Projects:

  • Budget-aware LLMs in Information Retrieval
  • Open-Domain Conversational Question Answering
  • Task-oriented dialogue systems:
    Development of exposed and internal APIs for a chatbot development platform; work on the backend kernel

University of California, Riverside

01/2021 - Present: Teaching Assistant

Courses taught:
CS 242: Information Retrieval and Web Search

Islamic University of Technology, Bangladesh

01/2019 - 08/2019: Lecturer

Taught several undergraduate courses in Computer Science and Engineering, including multiple lab classes. Worked closely with several undergraduate groups on their sophomore-year projects.

Samsung R&D Institute Bangladesh

11/2017 - 01/2018: Software Development Intern

  • Designed prototypes for automatic speaker recognition systems to improve Bixby voice assistant
  • Followed agile software development methodology, including daily scrum meetings, weekly sprint cycles, and code reviews

Education

Ph.D. in Computer Science
09/2019 - Present

University of California, Riverside
Advisor: Professor Vagelis Hristidis

B.Sc. in Computer Science
01/2015 - 10/2018

Islamic University of Technology, Bangladesh
Advisor: Professor Hasan Mahmud

Publications

Projects

  • Open Retrieval Conversational Question Answering
    Answering open-domain questions for transactional dialogue systems
    01/2021 - 12/2022
  • PsyBot: A FAQ chatbot on mental health
    A question-answering FAQ chatbot trained on 1000 FAQs collected automatically using QuAX. Made with Transformers and Python.
    04/2021 - 05/2021
  • TravelCrawl - A Reddit Based Travel Search Engine
    Searches Reddit posts related to travel. Indexing using MapReduce and Lucene. Made with Java.
    01/2020 - 03/2020
  • Safety Indexing of New York City Using Big Data Techniques
    Calculation of a safety index using the NYC crime dataset. Visualization of crime mapping using GeoJSON. Made with Java and Python.
    09/2019 - 12/2019
  • Health Based Ingredient Recommender System for Recipes
    Calculates the health value of ingredients using neural networks and suggests ingredients that would make food healthier. Made with Python.
    09/2019 - 12/2019
  • Emotion Recognition with Forearm Based Electromyography
    This is my undergraduate thesis in the area of HCI-ML. We classified emotions from features extracted from muscle data.
    01/2018 - 10/2018
  • Daily Companion
    Daily weather forecast, news, and to-do lists brought together in a common interface. Made with Java.
    08/2016 - 10/2016

Skills

Technical:
  • Programming Languages: Python, Java, C++, SQL
  • Platforms/Libraries: Git, PySpark, Hadoop, Pandas, DeepSpeed, PEFT, LoRA, MongoDB, REST APIs, Flask, Docker
  • NLP: PyTorch, Keras, TensorFlow, NLTK, spaCy, Scikit-learn, Hugging Face Transformers
Miscellaneous:
  • Debating, Public speaking
  • Creative thinking, Communication, Collaboration

Service

Sub-Reviewer
  • SIGMOD '23, CIKM '21, ICDE '21 '22 '23, DMAH '21, EDBT '21 '22 '23, DSAA '20, DMAH '20

Honors & Awards

  • Dean's Fellowship Award, UCR
  • OIC Scholarship, for IUT admission 2014-15