I am Akash Kumar,
a Data Scientist
at VMock India.

About
I am an IIT Kanpur B.Tech-M.Tech Dual Degree graduate with over four years of experience in Data Science & Analytics. I have completed several projects involving NER, document layout analysis, image segmentation, 3D reconstruction of transparent objects, and predictive models built with statistical modelling, machine learning, and deep learning. I am currently employed as Top Coder - Data Scientist at VMock India, contributing to cutting-edge R&D initiatives that use predictive analytics and artificial intelligence to develop a smart career analytics platform.
Programming & Frameworks
- Python, C/C++
- Tensorflow
- Pytorch
- OpenCV
- Pandas
- NumPy
- Matplotlib
- Seaborn
- MySQL
- Opensearch
- Redis
DevOps & MLOps
- Git, Docker
- MLflow
- AWS Sagemaker
- Flask
- Celery
- uWSGI
- Devspace
- DVC
- S3
Experience
VMock India Pvt. Ltd.
Top Coder-Data Scientist
August 2022 - Present
Part of the R&D team at VMock, using NLP and NLG to develop advanced parsing engines. Developed an end-to-end graph-based layout detection and merging module. Reduced the complexity of the text merging module and the overall latency of the parser by 30% using a lightweight tree model. Improved NER by 5% and implemented document layout analysis using LayoutLM with combined image and text embeddings as input. Developed an end-to-end plagiarism detection module using Locality Sensitive Hashing and an approximate-KNN search algorithm.
Education
Indian Institute of Technology Kanpur
B.Tech-M.Tech Dual Degree in Mechanical Engineering
June 2022
My thesis ("Computer vision-based estimation of angles from 3D reconstruction") applied computer vision to surface engineering, measuring the liquid contact angle by reconstructing droplets from images. I have a minor in "Industrial & Management Engineering". I was also the coordinator of the Dance Club, IIT Kanpur, and led the institute team to victory in national-level and inter-IIT competitions.
Projects & Publications
Here are some of the projects I have worked on lately. Feel free to check them out.

Computer vision-based on-site estimation of contact angle from 3-D reconstruction of droplets
A novel computer vision-based method for estimating the liquid contact angle using 3D reconstruction of a 30-microlitre droplet from non-orthogonal images acquired with a smartphone camera equipped with a macro lens and a custom-built setup. Our results showed an average error of 4% compared with state-of-the-art contact angle goniometer measurements.
- Publication
- Webpage

Estimation of planar angles from non-orthogonal imaging
A computationally less expensive sparse-reconstruction method for estimating planar angles using epipolar geometry and linear algorithms from non-orthogonal images acquired with a smartphone camera. This study can simplify the estimation of geometric parameters for opaque and Lambertian surfaces in on-site measurements of large deformations. The method estimated planar angles in less than 10 s on non-curved edges with an average error of 3%, using only ten images.
- Publication
- Webpage

Analyze This 2020
Customer segmentation based on profitability to increase customer referral penetration in digital channels through high incentives. Implemented greedy elimination for feature selection, SMOTE and random undersampling for class imbalance, and gradient-boosted trees for prediction. We were one of the top 2 teams among the 55 teams from IIT Kanpur.
- Data Science Competition

FIFA19: Data Analysis
Data cleaning and exploratory data analysis of the FIFA19 dataset using visual and descriptive statistics. Built regression models to predict a player's market value from features such as wage, ratings, and attributes like shooting, defending, and dribbling. The R² score for the best model was 0.97. Also classified players by play zone with an accuracy of about 85%.
- Exploratory data analysis

Cuisine Prediction
Applied data mining techniques to classify dishes by cuisine, given their ingredients. Used a document-term matrix and a TF-IDF matrix to transform the data during preprocessing, then applied ML algorithms such as ANN, Random Forest, and SVM for classification. Widely applicable in the food and catering sector.
- Data Mining

Get In Touch
I would love to hear from you. Whether you have a question or just want to chat about data science — shoot me a message.