Software Engineering Researcher

Tapajit Dey

I study how software is built, maintained, reused, and evolved across organizations and open source ecosystems.

I am a Member of Technical Staff Researcher at the Software Engineering Institute at Carnegie Mellon University, working in the AI Workflow and Architecture Modernization group. My research connects empirical software engineering, software architecture, open source and InnerSource development, mining software repositories, and applied AI for software evolution.

Bio

Short bio

Tapajit Dey is a Member of Technical Staff Researcher at the Software Engineering Institute at Carnegie Mellon University, working in the AI Workflow and Architecture Modernization group. His current work includes applied research on software architecture, large-scale refactoring, software project selection, open source project categorization, and generative AI for software translation and library replacement.

He completed his Ph.D. in Computer Science at the University of Tennessee, Knoxville with Dr. Audris Mockus, where his research focused on applying data mining and empirical software engineering techniques to analyze software ecosystems and software supply chains. Before joining SEI, he was a Postdoctoral Researcher and later Research Fellow at Lero, the Science Foundation Ireland Research Centre for Software at the University of Limerick, working with Prof. Brian Fitzgerald on InnerSource and open source software development.

He was one of the founding members of the Lero Open Source Program Office. He received his Bachelor's and Master's degrees in Electronics and Electrical Communication Engineering through the dual-degree program at the Indian Institute of Technology, Kharagpur, and worked at IBM India Software Lab for three years before starting his Ph.D.

Current role: MTS Researcher, SEI-CMU
Research roots: Empirical software engineering; data mining; software evolution
Prior industry experience: IBM India Software Lab
Education: Ph.D., UT Knoxville; B.Tech. + M.Tech., IIT Kharagpur

Upcoming

Upcoming roles and accepted work.

  1. Co-Chair of the ICSE 2027 Artifact Evaluation Track.

  2. Accepted paper at ICSE 2026 SEET: Prompting Without Principles: Are Students Transferring Software Engineering Knowledge to LLM Use?

  3. Program committee member for the MSR 2026 Technical Papers Track.

  4. Program committee member for the ESEC/FSE 2026 Tool Demonstrations Track.

Research

Understanding and improving how software is built, maintained, and reused.

Software architecture automation

Applied research on architecture design, analysis, refactoring, and validation workflows for large and complex software systems, now in the context of AI workflow and architecture modernization.

Open source ecosystems

Large-scale empirical studies of open source supply chains, project selection, project health, contributor behavior, and ecosystem structure.

InnerSource and collaboration

Research on InnerSource adoption, incentives, newcomer pathways, and organizational practices for collaborative software development.

Projects

Current and recent funded research.

Scaling Code Translation with Generative AI

SEI-CMU project team member. Fiscal Year 2025 project, funded at $1.5M, investigating how generative AI can support software translation across languages at scale. Contributed to the funding proposal with James Ivers.

Shift Left with Generative AI

SEI-CMU project team member. Fiscal Year 2024 project, funded at $500K, exploring generative AI for automating large-scale library replacement. Contributed to the funding proposal with Ipek Ozkaya, James Ivers, and Robert Edman.

TREES Research Program

At Lero, led and contributed to work on InnerSource adoption and open source newcomers within the €6M Trustworthy Responsible Efficient Engineering of Software research program, in collaboration with industry partners.

Open Research Training in Ireland

Activity coordinator for TROPIC, a €190K National Open Research Forum-funded program to develop and roll out open research training across Ireland and across disciplines.

Teaching & Mentoring

Teaching, supervision, and research mentorship.

Teaching

  1. Fall 2020 - Spring 2023, University of Limerick: Co-Instructor / Lecturer, CS4911: Introduction to Information Technology. Co-taught with Prof. Brian Fitzgerald for first- and second-year undergraduates. Adapted course evaluation to COVID restrictions, designed end-of-semester projects and rubrics, handled grading, and revised the course to include cybersecurity, edge computing, quantum computing, and current IT developments, and to follow Universal Design for Learning guidelines.
  2. Universal Design for Learning training: Completed Universal Design for Learning certification/training and applied it when revising CS4911 at the University of Limerick.
  3. Spring 2017, University of Tennessee: Teaching Assistant, Programming Languages and Systems. Graded weekly assignments and the midterm exam and delivered an introductory Python lecture.
  4. Fall 2015 and Fall 2016, University of Tennessee: Teaching Assistant, Fundamentals of Digital Archeology. Graded assignments and supported class activities.
  5. Spring 2016, University of Tennessee: Teaching Assistant, Operating Systems. Graded weekly assignments and the midterm exam.

Mentoring

  1. Doctoral co-supervision: Co-supervised Robert Healy through doctoral completion, together with Prof. Brian Fitzgerald and Prof. Kieran Conboy, on scaling Scrum with stable queuing systems and mixed-method evaluation of agile systems.
  2. Doctoral supervision support: Supervised Swarna Pundir for one year with Prof. Conor Ryan on AI models for fault prediction and debugging during the software development life cycle.
  3. Master's thesis supervision, Summer 2022: Supervised four Artificial Intelligence and Machine Learning master's projects on Twitter product-user feedback sentiment analysis, GitHub project text classification, Amazon product reviews and market basket analysis, and movie revenue prediction.
  4. Master's thesis supervision, Summer 2021: Supervised a master's project on detecting toxic comments on Stack Overflow using BERT.
  5. Thesis committees: Served on thesis committees for master's and undergraduate students in artificial intelligence, machine learning, and software analytics.

Selected Publications

Representative publications and artifacts.

CROSS: A Contributor-Project Interaction Lifecycle Model for Open Source Software

Tapajit Dey, Brian Fitzgerald, and Sherae Daniel. HICSS 2025.

Smarter Project Selection for Software Engineering Research

Tapajit Dey, Jonathan Loungani, and James Ivers. PROMISE 2024.

Comparing Stability and Sustainability in Agile Systems

Robert Healy, Kieran Conboy, Tapajit Dey, Edwin Lewzey, and Brian Fitzgerald. XP 2024.

A Novel Technique to Assess Agile Systems for Stability

Robert Healy, Tapajit Dey, Kieran Conboy, and Brian Fitzgerald. XP 2023.

Knights and Gold Stars: A Tale of InnerSource Incentivization

Tapajit Dey, Willem Jiang, and Brian Fitzgerald. IEEE Software, 2022.

Representation of Developer Expertise in Open Source Software

Tapajit Dey, Andrey Karnauch, and Audris Mockus. ICSE 2021.

For the full and current publication record, see Google Scholar, DBLP, and my CV.

Publication Records

Publication and presentation records.

Work | Venue | Records
Why Program Increment Estimates Miss: Hidden Variables for Planning | Software Excellence Alliance Monthly Tech Talk, February 2026 | Video
CROSS: A Contributor-Project Interaction Lifecycle Model for Open Source Software | HICSS 2025 | Preprint
Smarter Project Selection for Software Engineering Research | PROMISE 2024 | DOI
Comparing Stability and Sustainability in Agile Systems | XP 2024 | Publication
Representation of Developer Expertise in Open Source Software | ICSE 2021 | DOI, Preprint, Slides, Video, Bib
Effect of Technical and Social Factors on Pull Request Quality for the NPM Ecosystem | ESEM 2020 | DOI, Preprint, Slides, Video, Bib
Detecting and Characterizing Bots That Commit Code | MSR 2020 | DOI, Preprint, Slides, Video, Bib
An Exploratory Study of Bot Commits | ICSE Workshops 2020 | DOI, Preprint, Slides, Video, Bib
Patterns of Effort Contribution and Demand in the NPM Ecosystem | PROMISE 2019 | DOI, Preprint, Slides, Bib
Modeling Relationship between Post-Release Faults and Usage in Mobile Software | PROMISE 2018 | DOI, Preprint, Slides, Bib
Are Software Dependency Supply Chain Metrics Useful in Predicting Change of Popularity of NPM Packages? | PROMISE 2018 | DOI, Preprint, Slides, Bib

The earlier generated publication pages are preserved as source evidence in the full publication record and the dataset and preprint record.

Recognitions

Awards and recognitions with evidence links where available.

  1. YERUN Open Science Award

    Awarded to the Lero Open Science Committee by the Young European Research Universities Network. Award announcement

  2. Distinguished Reviewer Award, MSR 2022 Technical Track

    Award evidence

  3. Distinguished Reviewer Award, MSR 2021 Technical Track

    Award evidence

  4. Best Paper Award, PROMISE 2019

    Awarded for "Patterns of Effort Contribution and Demand and User Classification based on Participation Patterns in NPM Ecosystem." Conference record

  5. Best Graduate Research Assistant

    University of Tennessee, Knoxville, College of Engineering.

Service

Professional service to the software engineering research community.

Leadership and chair roles

  1. Co-Chair, ICSE Artifact Evaluation Track. International Conference on Software Engineering.
  2. Co-Chair, MSR Publicity and Social Media. 20th Working Conference on Mining Software Repositories.
  3. Co-Chair / Web / Proceedings, InnerSoft. Leadership roles for the 1st International Workshop on InnerSource Software Development at ICSE.
  4. Co-Chair, MSR Publicity and Social Media. 19th Working Conference on Mining Software Repositories.
  5. Proceedings Chair, PROMISE. International Conference on Predictive Models and Data Analytics in Software Engineering.

Program committee and reviewing roles

  1. Program Committee: ESEC/FSE Tool Demonstrations; MSR Technical Papers
  2. Program Committee: ICSE Research Track; MSR Technical Papers; AIware; BoatSE
  3. Program Committee: ICSE Research Track; MSR Technical Papers; PROMISE; ICSME Registered Reports
  4. Program Committee: ICSE Technical, Posters, and Artifact Evaluation; MSR Mining Challenge; ESEC/FSE Artifacts; ICSME Artifact Evaluation and ROSE Festival; PROMISE
  5. Program Committee / Advisor: MSR Technical Papers and Shadow PC Advisor
  6. Program Committee: MSR Technical Papers; OSS Papers

Community, journal reviewing, and memberships

  • Panel service: Panel member for the FM Vision 2030 workshop in Mexico City in 2023.
  • Journal reviewing: Reviewer for Empirical Software Engineering, ACM TOSEM, IEEE TSE, and IEEE Software.
  • Professional memberships: Member of ACM SIGSOFT, IEEE Computer Society, and InnerSource Commons.
  • Community building: Founding member of the Lero Open Source Program Office and former member of the Lero Diversity Committee.