Wei Yang

Full-Stack Software Engineer at PiSrc

Email: hey.weiyang@gmail.com

Phone: +1 551 260 0541

Web: inscribedeeper.github.io/

About Me

My experience has instilled in me a strong desire to solve difficult challenges with innovative data-driven solutions. I’m enthusiastic about projects that benefit the community, and I’m eager to assist businesses in developing better products by using data to derive insights.

Stacks:

  • Programming Languages: Python, Java, JavaScript, Shell Script, SQL
  • Database Management Tools: Elasticsearch, Weaviate, MySQL, Microsoft SQL Server, MongoDB
  • Technology: RAG, Microsoft Azure, Nginx, Adobe Experience Manager, MuleSoft, Docker, AWS, Git, Tableau
  • Python Packages: Pandas, NumPy, re, PyTorch, Transformers, Scikit-learn

Experience

PiSrc

Full-Stack Software Engineer

February 2022 - Present

Gained hands-on experience in software development and data pipeline implementation

  • AI Chatbot & RAG Systems:
    • Led full-cycle development of a production-grade RAG conversational AI chatbot using Azure OpenAI, Redis, and Weaviate vector database; architected sticky session routing and load balancing to support 1K+ DAUs
    • Designed multi-turn conversational memory with Redis persistence and context window management; built agentic workflows with function calling to orchestrate custom tools and external APIs
    • Implemented security guardrails including input sanitization, content filtering, and PII masking to ensure safe and compliant AI interactions
    • Delivered real-time AI Overview and autosuggest features powered by live user queries, with a caching layer for low-latency responses; built scheduled report pipelines and user feedback loops to continuously improve relevance and quality
    • Engineered scheduled multilingual (i18n) indexing pipelines across 10+ heterogeneous data sources, combining hybrid keyword and semantic search with semantic reranking, multi-channel query routing, query expansion, and iterative retrieval
  • Infrastructure & Platform Engineering:
    • Architected scalable full-stack infrastructure: VM provisioning, runtime orchestration, and offline pipelines for knowledge base synchronization and cache optimization
    • Maintained high reliability and low-latency performance through proactive monitoring and tuning
    • Applied data-driven insights to evolve CMS architecture and scale web platforms
  • Data Integration & Search Optimization:
    • Designed 7+ MuleSoft API integration workflows for incremental delta updates of partner accounts and locations
    • Integrated multi-source data into a Solr and Elasticsearch–based search layer with cache optimization
  • Personalization & Machine Learning Pipelines:
  • Content Platform (AEM):
    • Developed licensable software modules on Adobe Experience Manager to streamline content authoring and multi-channel publishing

Stevens Institute of Technology

Research Assistant/Teaching Assistant, Deep Learning and Web Analytics

June 2020 - December 2021

Language-feature pattern detection with deep learning models, data parsing, and statistical modeling

  • Cleaned and structured 94,581 earnings call transcripts to explore 96 language factors influencing stock returns
  • Achieved 72% accuracy on text classification with a domain-adapted BERT language model using PyTorch
  • Conducted text analysis, sentiment analysis, and text mining using NLP techniques with spaCy and NLTK
  • Set up and managed a remote GPU environment on Ubuntu for Machine Learning and Deep Learning tasks
  • Developed tutorials on implementing deep learning models using related Python packages

Education

Stevens Institute of Technology

MSc in Data Science (GPA 3.8/4.0)

September 2019 - December 2021

Relevant Coursework: Statistical Methods, Statistical Inference, Advanced Optimization Methods, Advanced Data Analytics & Machine Learning, Deep Learning, Natural Language Processing, Web Analytics, Database Management Systems, Web Programming, Data Structures & Algorithms

Scholarship: Provost’s Scholarship, 2019

Guangzhou University

BSc in Mathematics and Applied Mathematics

September 2014 - May 2018

Relevant Coursework: Probability and Mathematical Statistics, Operational Research, Numerical Analysis, Advanced Algebra, Mathematical Analysis, Real Function Theory, Functional Analysis, Ordinary Differential Equations, Partial Differential Equations

Awards: Second prize (top 6.3% out of 25,558 teams) in the National Mathematical Modeling Contest, 2015

Projects

MyPlace Web Development

Developed a web application for furniture and rental information exchange

  • Led a team of four students to design and develop a web application with Node.js and Express
  • Designed the document schema in MongoDB and wrapped CRUD operations as RESTful APIs
  • Implemented login and user-specific functions such as account signup and authentication
  • Developed features for rental and furniture information exchange such as comments, search, and a dashboard

E-commerce Recommender System

Developed a recommender engine for e-commerce data using collaborative filtering

  • Created and implemented a recommender engine with users, items, and interaction records from JD.com
  • Integrated multiple memory-based and model-based collaborative filtering algorithms to make recommendations
  • Simulated on 7,000 pre-defined user-item interaction samples and attained 77% Top-10 Accuracy

Fintech Pitch Competition - 6th Position

Developed a mathematical model based on public data and metrics to measure and predict the vibrancy of U.S. cities

  • Constructed a vibrancy index to interpret and predict the prosperity trend of cities in the U.S.
  • Explored, collected, and blended data from Google POIs, Instagram, Zillow, and the Bureau of Labor Statistics with Pandas
  • Performed feature engineering on panel data, and fine-tuned models for prediction with XGBoost and Keras

Quantifying the Impact of AI on Job Skills

Data scraping, parsing and information extraction, and topic modeling with clustering algorithms and neural networks

  • Scraped, cleaned, and structured job description text data from Indeed.com
  • Filtered noisy data by selecting clusters from the K-means algorithm and visualized the distribution of skill sets
  • Categorized and analyzed skill-set trends using topic modeling and an aspect-extraction deep neural network

Analysis of Factors Influencing Bitcoin

Feature engineering, data analysis, and modeling

  • Performed data cleaning and sentiment analysis on over 20 million related tweets at one-second granularity
  • Engineered features from financial indicators such as MACD and RSI
  • Analyzed, fine-tuned, and back-tested more than five types of deep learning models to increase portfolio return

Familiar Python Packages

  • Data Scraping: Selenium, BeautifulSoup
  • Data Manipulation: Pandas, NumPy, Regular Expression
  • Textual Data Processing: spaCy, NLTK, Gensim
  • Machine Learning and Deep Learning: PyTorch, HuggingFace Transformers, Scikit-learn, Keras
  • Data Visualization: Matplotlib, Seaborn

Modeling Knowledge Base

  • Neural Networks: Transformer, Convolutional Neural Networks, Variational Autoencoder
  • NLP: Word2Vec, BERT, Transformer, Latent Dirichlet Allocation
  • Regression: Linear Regression, Ridge/Lasso Regression, Time Series Regression (ARIMA)
  • Classification: Support Vector Machine, Logistic Regression, Naive Bayes, Decision Tree, K-nearest Neighbors (KNN)
  • Dimension Reduction: Principal Component Analysis (PCA), Singular Value Decomposition (SVD)
  • Clustering: K-means, Hierarchical Clustering, DBSCAN
  • Ensembling Techniques: Bagging, Boosting, Stacking

The sentences I love

“What I cannot create, I do not understand”