My name is Mengye. I am a data analyst with a B.S. in applied mathematics from UC Davis and a certificate in data analytics from UC Berkeley, with skills in Excel, VBA, Python, SQL, machine learning, and big data. I love the logic and beauty behind math, and I also love the story behind data. I am able to start a project from zero and apply my skills to go beyond the minimum requirements and deliver impressive analysis. I recently completed a project in a team of four that used Kaggle data sets, Python, JavaScript, CSS, Flask, and machine learning to classify news as fake or true. The project used NLP (Natural Language Processing) to transform words into numerical features and applied five prediction methods, including random forest and neural networks. Strong problem-solving and critical-thinking skills, combined with experience collaborating across different groups, make me a valuable part of any team.
Collaborated with four data analysts to collect, organize, and analyze over sixty thousand lines of data. Used NLP (Natural Language Processing), including tokenizing, removing stopwords, and lemmatizing, to transform words into numerical features, and created an interactive web page to test any input. Applied over five different machine learning methods, including neural networks with Keras and deep learning, to analyze and predict whether new input news is true. Also created a summary web page as a report. Troubleshot technical issues and refined processes while running the convolutional neural network method.
Tools used: Pandas, Google Colab, JavaScript, CSS, HTML5, Python, Flask, Big Data, Machine Learning, Joblib
Worked in a team of four to explore ten years of homelessness data in the US, including acquiring over eighty thousand data entries through web scraping, data cleanup, and transformation. Visualized the data with a bubble chart, dot chart, comparison line chart, radial bar chart, and geo-mapping. Personally built the bubble chart visualization and a search bar for finding nearby shelters by ZIP code. Also developed and organized the analytical reports within one website. Dealt with hardware crashes during web scraping, which ran for over 12 hours. Documented relevant timelines and deliverables and presented to a group of 25 data analysts.
Tools used: Regular Expression (re), HTML, CSS, JavaScript, Pandas, SQL, D3, Leaflet
Collaborated in a team of four to observe and analyze all languages spoken in the US by state, using over seven years of data collected through APIs with API keys. Managed time and completed the project in less than 2 weeks by working remotely through Zoom and Slack. Created line charts and heat maps by grouping the fifty states into five regions. Used a hypothesis-driven approach to set up and run t-tests on selected languages to test the relationship of growth, and provided recommendations based on the test results.
Tools used: Pandas, Matplotlib, API keys
Worked individually to prove a conjecture in combinatorial number theory stating that only 1 appears infinitely many times in Pascal's Triangle. Made observations and used mathematical logic to prove multiple lemmas and theorems before applying them. Troubleshot and resolved problems with LaTeX and with the proof of the conjecture.
Tools used: LaTeX
Completed and presented three different analytical projects in groups, and created numerous visualizations to analyze the data with Tableau, Matplotlib, Plotly, and other tools. Also discovered insights by creating meaningful tables with PostgreSQL, Excel, and R.
Guided groups of 20+ young students through activities such as arts and crafts, homework tutoring, and summer camp. Facilitated field trips all around the Bay Area, such as from Fremont to San Francisco, by public transportation.
Helped the TA run discussions smoothly and explained the theorems behind the mathematical content to one or more students.
Tutored 2-4 students at a time on math problems while grading papers.