Arnab Kar

Lead Data Scientist at MITIFY+
📚 Lead Data Scientist | New York, New York, United States
0 Publications
0 Followers
0 Following
2 Questions

👤 About

Skills & Expertise

C++ · Java · Financial Analysis · Deep Learning · Data Visualization · Statistical Data Analysis · Computational Finance · Curve Fitting · Agile Methodologies · R (Programming Language) · Machine Learning Algorithms · Multivariate Statistics · Business Analysis · SQL · Social Network Analysis · Economic Data Analysis · Exploratory Data Analysis · Parallel Computing · High Performance Computing (HPC) · Semiconductor Fabrication · Formal Verification · Time Series Analysis · Financial Modeling · Analytical Modelling · Natural Language Processing (NLP) · Object-Relational Mapping (ORM) · Computer Vision · Neural Networks · Computational Fluid Dynamics (CFD) · Electrostatics · Mathematical Modeling · Statistical Modeling · Partial Differential Equations · Multi-channel Retail · Data Structures · Database Management System (DBMS)

Research Interests

Data Scientist · Machine Learning Engineer · Curve Fitting · Quantitative Research · Network Science · Electrostatics · Mathematical Modeling · Operations Management

💼 Experience

Lead Data Scientist

MITIFY+ · April 2023 - Present
  • Founding Engineer and Lead Data Scientist; cybersecurity and social-media misinformation monitoring.
    ● 0-to-MVP in 4 months. IP generation (3 patents filed) and implementation: 1) PR management, 2) brand impact estimation, 3) propaganda/risk ratings.
    ● NLP, Q/A ML models, hierarchical clustering, event/entity detection and disambiguation, emotion-cause-pair NLP models, vision-to-text models, Vertex AI, Metabase.
    ● Investor, client, and partner relations. Scoped requirements for antivirus companies, media houses, and government (B2B, B2G). Promised €5.7M over 3 years by the government. Developing IP and technology stack for a language misinformation (NLP) detection suite.
    ☆ Intellectual Property: Spearheaded the development of core IPs crucial for advanced data analysis and misinformation detection.
    ☆ Technology Development: Led creation of technology stacks that form the backbone of the platform’s offerings. Created all intellectual property.
    ☆ Stakeholder Management: Managed requirements across diverse sectors (internal and external).

Co-founder, ML and Data Engineering AI

Exchange Robotics (ExR) · May 2023 - November 2023
  • Generative AI for illiquid credit. ☆ Intellectual Property: Developed technological solutions and IP for the firm, catering to the secondary credit market. ☆ Technology Development: Oversaw the creation of technological solutions enhancing platform capabilities for financial markets. ☆ Stakeholder Management: Engaged major stakeholders such as Moody’s and S&P, contributing significantly to partnership development and fundraising.

Machine Learning Scientist

Saks Fifth Avenue, US · June 2022 - February 2023
  • Demand forecasting, operations research, financial budgeting. ☆ Technology Development: Enhanced machine learning models for demand forecasting, increasing accuracy significantly. ☆ Business Implementation: Implemented multi-modal models that improved inventory management, directly affecting financial strategies and purchase decisions.

Deep Learning Engineer II

E Ink Corporation (Eink) · February 2021 - May 2022
  • ☆ Intellectual Property: Engineered simulations of ink display behavior that served as foundational technologies for product development (led to 5 patents/trade secrets). ☆ Technology Development: Accelerated the digital emulation of physical displays, reducing product delivery timelines. ☆ Stakeholder Management: Collaborated with a cross-functional team including executives and production managers to align technological developments with strategic business goals.

Senior Data Scientist

Sirion · May 2020 - February 2021
  • ☆ Technology Development: Utilized advanced NLP techniques for analyzing and managing legal contracts. ☆ Business Implementation: Improved data accessibility and operational efficiency by automating the extraction and mapping of contract details.

Pre Doctoral Intern

Institute of Science and Technology Austria (ISTA) · January 2018 - August 2018
  • Distributed Computing, Distributed Systems, Distributed Machine Learning.

Student Research Associate

IIT Kanpur (Indian Institute of Technology, Kanpur) · May 2017 - August 2018
  • Generative Adversarial Networks for image understanding, and representation learning. Multi-agent game theoretical modeling of neural models.

Student Research Intern

Indian Statistical Institute, Kolkata · May 2015 - December 2015
  • Chip performance modeling and security analysis. Analyzed attack vectors with fabrication simulators. Skills: Formal Verification · C++ · Java · Data Science · Semiconductor Fabrication

Research Intern

IIT Kharagpur (Indian Institute of Technology) · May 2014 - December 2014
  • Economic Data Analysis · Java · Network Science · Machine Learning · High Performance Computing (HPC) · SQL · Social Network Analysis · Exploratory Data Analysis · Data Science · Parallel Computing

🎓 Education

Indian Institute of Information Technology Allahabad (IIIT-A)

B.Tech in Information Technology (IT) · 2025

Duke University (DU)

Ph.D. in Computer Science · 2018

🚀 Projects

Knowledge-graph based fact-checks and relationship-discovery in Language-Model generated content
Agency Name: Indian Institute Of Information Technology || Aug 2014 - Jun 2015
Used graph queries on a knowledge graph to fact-check content and verify logical claims (including claims that are not directly present in the knowledge graph). Other applicable use-cases could be: ☆ Legal-tech use-case: Enable deductions and implications in corporate and legal communications, supporting better dispute handling. ☆ Insurance verification: claim checks, fact verification, policy-alignment checks. The tool would enable improved cash flow, reduced administrative burden, and improved transparency.
Provably verified data security when performing data analysis on permissioned data
Agency Name: Indian Institute Of Information Technology || Sep 2015 - Jun 2016
Used mathematical techniques to ensure that data analysis on disaggregated data respects the ownership and access-control rules set on the data sources. Other use-cases could be: ☆ Ensuring that AI-generated content or AI-enabled knowledge discovery respects the ownership rules of the underlying data source and does not reveal confidential data. ☆ Ensuring that financial or corporate reporting does not leak business logic or confidential information: revealing information at the right level of granularity between organizational silos, up the organizational pyramid, and/or in public disclosures.
Distributed Data-Influence detection for (any) Machine Learning model (Interpretability, explainability)
Agency Name: Institute of Science and Technology Austria || Jan 2018 - Aug 2018
Developed methods that could explain model behavior, at scale, as it relates to underlying training data. Other similar use-cases could be: ☆ ML Trust and Safety use-cases: ensuring trust in models by attributing model behavior to data; de-biasing models from gendered/societal/cultural artifacts. ☆ De-biasing model-enabled decision-making on protected attributes (gender, race, etc).
Multi-agent Generative modeling
Agency Name: Indian Institute of Technology, Kanpur || May 2017 - Oct 2018
Generative modeling to generate images (or scenarios) *interpretably*: the model could reason about the entities (people, things, objects, background) in a generated image (and between images). Also applicable to other use-cases such as: ☆ Financial use-case: macro-economic scenario modeling to understand market dynamics with different *risk-leveled* agents/participants (macro-economic scenarios being one of the agent behaviors). ☆ Business scenario modeling: impact on different lines of business and the bottom line under different competitive dynamics, macro-economic conditions, and regulatory environments. ☆ Insurance use-cases: modeling the impact of different market conditions on different industry sectors/sub-sectors (and corresponding lines of underwriting applications).
Time (processor) and Space (memory) optimized Machine Learning hardware-accelerator scheduler
Agency Name: Duke University || Aug 2019 - May 2020
Designed, implemented, and tested scheduling techniques to make optimal use of hardware resources for Machine Learning payloads on tabular databases (SQL queries, for example) under time limits. Also applicable to the following kinds of use-cases: ☆ Project Management: Time and resource optimization for complex engineering/supply-chain/industrial projects, with embedded risk management and contingency planning. ☆ Fleet management: using resources (human and capital) optimally while respecting client requirements and tolerance limits.

🏅 Certificates & Licenses (1)

Computational Microeconomics
Event: Computational Microeconomics · Duke University · Issued February 2025