Full-Stack Software Engineer at PiSrc

About Me

My experience has instilled in me a strong desire to solve difficult challenges with innovative data-driven solutions. I’m enthusiastic about projects that benefit the community, and I’m eager to assist businesses in developing better products by using data to derive insights.

Stacks:

Programming Languages: Python, Java, JavaScript, Shell Script, SQL
Database Management Tools: Elasticsearch, Weaviate, MySQL, Microsoft SQL Server, MongoDB
Technology: RAG, Microsoft Azure, Nginx, Adobe Experience Manager, MuleSoft, Docker, AWS, Git, Tableau
Python Packages: Pandas, NumPy, re, PyTorch, Transformers, Scikit-learn

Experience

PiSrc

Full-Stack Software Engineer

February 2022 - Present

Hands-on experience in software development and data pipeline implementation

AI Chatbot & RAG Systems:
- Led full-cycle development of a production-grade RAG conversational AI chatbot using Azure OpenAI, Redis, and Weaviate vector database; architected sticky session routing and load balancing to support 1K+ DAUs
- Designed multi-turn conversational memory with Redis persistence and context window management; built agentic workflows with function calling to orchestrate custom tools and external APIs
- Implemented security guardrails including input sanitization, content filtering, and PII masking to ensure safe and compliant AI interactions
- Delivered real-time AI Overview and autosuggest powered by live user queries, with a caching layer for low-latency responses; built scheduled report pipelines and user feedback loops to continuously improve relevance and quality
- Engineered scheduled multilingual (I18N) indexing pipelines across 10+ heterogeneous data sources, combining keyword-based and semantic hybrid search with semantic reranking, multi-channel query routing, query expansion, and iterative retrieval
Infrastructure & Platform Engineering:
- Architected scalable full-stack infrastructure: VM provisioning, runtime orchestration, and offline pipelines for knowledge base synchronization and cache optimization
- Maintained high reliability and low-latency performance through proactive monitoring and tuning
- Applied data-driven insights to evolve CMS architecture and scale web platforms
Data Integration & Search Optimization:
- Designed 7+ MuleSoft API integration workflows for incremental delta updates of partner accounts and locations
- Integrated multi-source data into a Solr and Elasticsearch–based search layer with cache optimization
Personalization & Machine Learning Pipelines:
- Developed User-to-Item and Item-to-Item recommendation pipelines with rolling cache and Akamai CDN integration
- Built offline ML pipelines to optimize sales funnel conversion and marketing campaign targeting
Content Platform (AEM):
- Developed licensable software modules on Adobe Experience Manager to streamline content authoring and multi-channel publishing

Stevens Institute of Technology

Research Assistant/Teaching Assistant, Deep Learning and Web Analytics

June 2020 - December 2021

Language feature pattern detection with deep learning models, data parsing, and statistical modeling

Cleaned and structured 94,581 earnings call transcripts to explore 96 language factors influencing stock returns
Achieved 72% accuracy on text classification with a domain-adapted BERT language model using PyTorch
Conducted text analysis, sentiment analysis, and text mining using NLP techniques with spaCy and NLTK
Set up and managed a remote GPU environment on Ubuntu for machine learning and deep learning tasks
Developed tutorials on implementing deep learning models using related Python packages

Education

Stevens Institute of Technology

MSc in Data Science (GPA 3.8/4.0)

September 2019 - December 2021

Relevant Coursework: Statistical Methods, Statistical Inference, Advanced Optimization Methods, Advanced Data Analytics & Machine Learning, Deep Learning, Natural Language Processing, Web Analytics, Database Management Systems, Web Programming, Data Structures & Algorithms

Scholarship: Provost’s Scholarship, 2019

Guangzhou University

BSc in Mathematics and Applied Mathematics

September 2014 - May 2018

Relevant Coursework: Probability and Mathematical Statistics, Operations Research, Numerical Analysis, Advanced Algebra, Mathematical Analysis, Real Function Theory, Functional Analysis, Ordinary Differential Equations, Partial Differential Equations

Awards: Second prize (top 6.3% out of 25,558 teams) in the National Mathematical Modeling Contest, 2015

Projects

MyPlace Web Development

Develop a web application for furniture and rental information exchange

Led a team of four students to design and develop a web application with Node.js and Express
Designed the document schema in MongoDB and wrapped CRUD operations as RESTful APIs
Implemented login and user-specific functions such as account signup and an authentication system
Developed features for rental and furniture information exchange such as comments, search, and a dashboard

E-commerce Recommender System

Build a recommender engine on JD.com user-item interaction data using collaborative filtering

Created and implemented a recommender engine with users, items, and interaction records from JD.com
Integrated multiple memory-based and model-based collaborative filtering algorithms to make recommendations
Simulated on 7,000 pre-defined user-item interaction samples and attained 77% Top-10 Accuracy

Fintech Pitch Competition - 6th Position

Develop a mathematical model based on public data and metrics to measure and predict the vibrancy of cities in the U.S.

Constructed a vibrancy index to interpret and predict the prosperity trend of cities in the U.S.
Explored, collected, and blended data from Google POIs, Instagram, Zillow, and the Bureau of Labor Statistics with Pandas
Performed feature engineering on panel data, and fine-tuned models for prediction with XGBoost and Keras

Quantifying AI's Impact on Job Skills

Data scraping, parsing and information extraction, and topic modeling with clustering algorithms and neural networks

Scraped, cleaned, and structured job description text data from Indeed.com
Filtered noisy data by selecting clusters from the K-means algorithm and visualized the distribution of skill sets
Categorized and analyzed skill set trends using topic modeling and an aspect-extraction deep neural network

Analysis of Factors Influencing Bitcoin

Feature engineering, data analysis, and modeling

Performed data cleaning and sentiment analysis for over 20 million related Tweets at second-level granularity
Engineered features from financial indicators such as MACD and RSI
Analyzed, fine-tuned, and back-tested more than five types of deep learning models to increase portfolio return

Familiar Python Packages

Data Scraping: Selenium, BeautifulSoup
Data Manipulation: Pandas, NumPy, Regular Expression
Textual Data Processing: spaCy, NLTK, Gensim
Machine Learning and Deep Learning: PyTorch, HuggingFace Transformers, Scikit-learn, Keras
Data Visualization: Matplotlib, Seaborn

Modeling Knowledge Base

Neural Networks: Transformer, Convolutional Neural Networks, Variational Autoencoder
NLP: Word2Vec, BERT, Transformer, Latent Dirichlet Allocation
Regression: Linear Regression, Ridge/Lasso Regression, Time Series Regression (ARIMA)
Classification: Support Vector Machine, Logistic Regression, Naive Bayes, Decision Tree, K-Nearest Neighbors (KNN)
Dimension Reduction: Principal Component Analysis (PCA), Singular Value Decomposition (SVD)
Clustering: K-means, Hierarchical Clustering, DBSCAN
Ensembling Techniques: Bagging, Boosting, Stacking

A sentence I love

“What I cannot create, I do not understand.”

Wei Yang