Cracking the Code on Software Capitalization

Neo.Tax
February 5, 2026

Software capitalization under ASC 350-40 presents a complex compliance challenge for even the most sophisticated software companies. The largest firms build equally complex (and manual) processes to wrangle the problem; most companies don't have the resources to do so.

Why is it so difficult? The accounting standard itself isn’t the issue, because it’s fairly straightforward and well-defined: a firm must capitalize costs incurred during the application development stage, and expense costs attributable to preliminary project activities, post-implementation operations, and maintenance. In practice, however, applying these rules requires answering two questions that are harder than they first appear:

  1. Does a given software project qualify for capitalization?
  2. If so, when does the capitalization period end? When does it go from active development to “maintenance mode”? 

To answer the first, a company has to distinguish genuine feature development from maintenance activities, operational support, and non-development work. That may not sound too bad—most engineers and their managers can do this at any given tech company.

But can their accounting team do it? Ah, there’s the rub.

If they manage to get past the first hurdle, answering the second is even trickier. As most engineers will tell you, software projects rarely have discrete endpoints; development activities blend gradually into operations and maintenance, and identifying the transition point is inherently ambiguous.

If a company does decide to take the traditional approach—rely on the engineering managers to self-report capitalization percentages or retrospectively classify their team's work—it will likely introduce systematic bias and inconsistency. After all, engineers are not accountants, and they are not incentivized to spend their time accurately capitalizing projects. But that accuracy is critical for the business as a whole: capitalization affects reported R&D spending, EBITDA, and other metrics, potentially creating pressure that undermines objectivity.

To avoid these pitfalls, we decided to approach capitalization as a data-driven machine learning problem. Here’s how we did it.

Classification at the Atomic Level

The foundation of our system is a classifier that operates on individual work items—tickets, tasks, user stories—rather than projects or portfolios. Each work item is assigned to one of several internally defined classes that reflect whether the activity contributes new product functionality or maintains existing systems.

This granular, ticket-level analysis enables aggregation at any level—project, team, or time period—while preserving full auditability back to source records. Finance teams can trace high-level determinations directly to the underlying engineering activity, rather than relying on summary narratives or retrospective estimates.
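To make the roll-up concrete, here is a minimal sketch of how per-ticket labels could be aggregated by project while keeping ticket IDs attached for traceability. The field names, labels, and tickets are illustrative, not Neo.Tax's actual schema:

```python
from collections import Counter, defaultdict

# Hypothetical per-ticket classifications; ids, projects, and labels are made up.
tickets = [
    {"id": "ENG-101", "project": "checkout-v2", "label": "feature"},
    {"id": "ENG-102", "project": "checkout-v2", "label": "feature"},
    {"id": "ENG-103", "project": "checkout-v2", "label": "maintenance"},
    {"id": "OPS-201", "project": "ci-pipeline", "label": "internal_ops"},
]

def aggregate_by_project(tickets):
    """Roll ticket-level labels up into per-project distributions,
    keeping the ticket ids so every number traces back to source records."""
    summary = defaultdict(lambda: {"counts": Counter(), "ticket_ids": []})
    for t in tickets:
        summary[t["project"]]["counts"][t["label"]] += 1
        summary[t["project"]]["ticket_ids"].append(t["id"])
    return dict(summary)

by_project = aggregate_by_project(tickets)
print(by_project["checkout-v2"]["counts"])
# → Counter({'feature': 2, 'maintenance': 1})
```

The same dictionary can be re-grouped by team or by date range; the key property is that every aggregate figure is backed by a list of concrete ticket IDs.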

Fine-tuning an LLM for Classification

Text classification is conventionally framed as a "traditional ML" problem. Standard approaches include TF-IDF or embedding-based feature extraction followed by gradient boosting, random forests, or support vector machines, among other methods. More recent work employs encoder-only transformers like BERT or RoBERTa, fine-tuned on labeled examples.
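A minimal baseline of this conventional kind—TF-IDF features feeding a linear SVM—might look like the following sketch. It assumes scikit-learn is available, and the texts and labels are purely illustrative:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Tiny illustrative dataset; a real training set would be far larger.
texts = [
    "Add OAuth login flow to signup page",
    "Implement new export feature for reports",
    "Fix flaky nightly build on CI",
    "Rotate expired TLS certificates",
]
labels = ["feature", "feature", "maintenance", "internal_ops"]

baseline = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # word and bigram features
    LinearSVC(),                          # linear support vector classifier
)
baseline.fit(texts, labels)
prediction = baseline.predict(["Add SSO support to the admin console"])[0]
print(prediction)
```

This style of model learns surface-level lexical patterns, which is precisely where it struggles on real engineering tickets, as described below.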

We evaluated these approaches and found them insufficient for our domain. Engineering ticket text exhibits several characteristics that challenge traditional methods:

  • High Lexical Variance: The same type of work may be described in radically different ways across organizations, teams, and individuals.
  • Implicit Context: Tickets frequently reference external systems, internal jargon, and organizational context that requires semantic understanding beyond surface-level pattern matching.
  • Sparse Signal: Many tickets contain minimal text, requiring the model to extract maximum information from limited input.

Our solution was to fine-tune a large language model (LLM) for sequence classification. While LLMs are most often thought of as chatbots, the underlying architecture can be adapted to tasks associated with more conventional machine learning problems. Because these models have already undergone massive pre-training, adapting one to a specific use case doesn’t require repeating that investment. Instead, techniques like Low-Rank Adaptation (LoRA) offer an efficient but powerful way to nudge the model's weights toward a specific task while preserving its pre-trained semantic understanding.
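The efficiency of LoRA comes from how the weight update is factored. Instead of training a full weight matrix, LoRA trains two small low-rank matrices whose product perturbs the frozen pre-trained weight. A NumPy sketch of the arithmetic (dimensions and hyperparameters are illustrative, not our production configuration):

```python
import numpy as np

rng = np.random.default_rng(0)

d_out, d_in, r = 768, 768, 8  # hidden sizes typical of a transformer layer; r is the LoRA rank

W = rng.standard_normal((d_out, d_in))      # frozen pre-trained weight (not trained)
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, initialized to zero

alpha = 16  # LoRA scaling hyperparameter
W_adapted = W + (alpha / r) * (B @ A)  # effective weight after adaptation

# Trainable parameters drop from d_out * d_in to r * (d_in + d_out).
full = W.size
lora = A.size + B.size
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.1%}")
# → full: 589,824  lora: 12,288  ratio: 2.1%
```

Because `B` starts at zero, the adapted model initially behaves exactly like the pre-trained one; training then shifts only the low-rank factors, which is why fine-tuning stays cheap relative to pre-training.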

The results we found from fine-tuning via LoRA were significant. Compared to our best-performing encoder-based baseline, the fine-tuned LLM achieved materially higher classification scores across all three classes, with particularly pronounced improvements on the minority classes (maintenance and internal operations) that are most critical for capitalization determination. The model’s richer contextual understanding allows it to correctly classify tickets that simpler approaches systematically mishandle.

Determining Capitalizability: A Tiered Architecture

With per-ticket classifications in hand, determining project-level capitalizability becomes tractable. We employ a tiered approach that balances precision with computational efficiency:

Threshold-Based Determination

For projects with sufficient ticket data, we compute the distribution of work across our three categories. Projects exhibiting an overwhelming concentration in non-capitalizable categories—maintenance or internal operations—can be classified deterministically without requiring additional analysis. This fast-path determination handles a substantial percentage of projects, particularly those representing ongoing operational work.
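The fast path can be sketched as a simple threshold rule over the ticket-label distribution. The 0.9 cutoff and the label names below are illustrative assumptions, not Neo.Tax's actual parameters:

```python
def fast_path_determination(counts, non_cap_threshold=0.9):
    """Deterministic rule: if maintenance and internal-ops tickets dominate,
    classify the project as non-capitalizable without further analysis.
    Ambiguous projects are escalated to the LLM tier."""
    total = sum(counts.values())
    if total == 0:
        return "insufficient_data"
    non_cap = counts.get("maintenance", 0) + counts.get("internal_ops", 0)
    if non_cap / total >= non_cap_threshold:
        return "non_capitalizable"
    return "needs_llm_review"  # does not meet the threshold: proceed to deeper analysis

print(fast_path_determination({"maintenance": 18, "internal_ops": 1, "feature": 1}))
# → non_capitalizable
```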

LLM-Augmented Analysis

Projects that do not meet the threshold criteria for automatic determination proceed to a more nuanced analysis. Here, we employ large language models again, this time with structured prompting to evaluate project descriptions against ASC 350-40 criteria. The model produces both a determination and an accompanying rationale, providing the audit trail that finance teams require.
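A structured prompt for this second tier might be scaffolded as follows. The wording, schema, and example project are hypothetical (the actual Neo.Tax prompts are not public); the point is that the model is asked for both a determination and a machine-readable rationale:

```python
import json
from textwrap import dedent

# Illustrative response schema: a determination plus an auditable rationale.
SCHEMA = {
    "determination": "capitalizable | non_capitalizable",
    "rationale": "short explanation citing ASC 350-40 criteria",
}

def build_prompt(project_name, description, label_distribution):
    """Assemble a structured prompt evaluating a project against ASC 350-40."""
    return dedent(f"""\
        You are evaluating a software project under ASC 350-40.
        Costs in the application development stage are capitalized;
        preliminary, post-implementation, and maintenance costs are expensed.

        Project: {project_name}
        Description: {description}
        Ticket label distribution: {json.dumps(label_distribution)}

        Respond with JSON matching this schema:
        {json.dumps(SCHEMA, indent=2)}
    """)

prompt = build_prompt(
    "checkout-v2",  # hypothetical project name
    "Rebuild of the checkout flow with a new payments integration.",
    {"feature": 0.7, "maintenance": 0.2, "internal_ops": 0.1},
)
print(prompt)
```

The returned JSON can be stored alongside the per-ticket classifications, so the rationale itself becomes part of the audit trail.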

This hybrid architecture optimizes for both accuracy and cost. Deterministic classification handles clear cases efficiently, while reserving more expensive LLM inference for genuinely ambiguous situations.

Identifying the Capitalization End Date: A Time-Series Approach

As previously described, determining when capitalization should terminate is arguably more challenging than determining whether a project qualifies. The accounting standard provides guidance—capitalization ends when the software is substantially complete and ready for its intended use—but mapping this definition to the continuous stream of development activity requires a principled methodology.

We model this as a time-series segmentation problem. For each project, we construct a daily signal reflecting how the mix of engineering activity evolves over time. This signal is smoothed to reduce noise, and gaps are handled systematically to avoid spurious conclusions.

Rather than relying on manually defined milestones, we apply statistical techniques designed to detect meaningful shifts in underlying patterns. These methods identify points where the character of the work changes in a sustained way, allowing us to partition a project’s history into distinct phases.

Our decision logic then evaluates those phases:

  • Persistent Non-Capitalizable Activity: If a project’s entire timeline is dominated by work that would not qualify for capitalization, it likely never entered a capitalizable development phase during the period analyzed.
  • Terminal Phase Identification: For projects with mixed patterns, we isolate the final sustained phase in which non-capitalizable work predominates. The start of that terminal phase serves as the capitalization end date.
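As a simplified illustration of the terminal-phase idea, the sketch below smooths a daily "share of capitalizable work" signal and walks back to the start of the final sustained run where non-capitalizable work predominates. The moving-average smoothing, the 0.5 threshold, and the window size are illustrative stand-ins for the statistical change-point methods described above:

```python
import numpy as np

def capitalization_end_index(daily_cap_share, window=7, threshold=0.5):
    """Return the index (day) where the final non-capitalizable phase begins,
    or None if the project is still in active development at the end.
    `daily_cap_share` is the fraction of each day's tickets classified
    as capitalizable work."""
    x = np.asarray(daily_cap_share, dtype=float)
    # Smooth with a simple moving average to suppress day-to-day noise.
    kernel = np.ones(window) / window
    smooth = np.convolve(x, kernel, mode="same")
    below = smooth < threshold
    if not below[-1]:
        return None  # development work still dominates at the end of the window
    # Walk back to the first day of the terminal below-threshold run.
    i = len(below) - 1
    while i > 0 and below[i - 1]:
        i -= 1
    return i

# 60 days of feature-heavy development followed by 30 days of maintenance.
signal = [0.9] * 60 + [0.2] * 30
print(capitalization_end_index(signal))  # prints an index near day 60
```

A project whose entire timeline sits below the threshold returns day 0, matching the "persistent non-capitalizable activity" case: it never entered a capitalizable phase during the period analyzed.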

This approach anchors end-date determinations in observed engineering behavior, rather than subjective project milestones or manager estimates.

A New Trend: the Rise of ML in Accounting

This LLM-enabled system that we’ve built for software capitalization is yet another example of a trend we can’t help but notice here at Neo.Tax: the ever-increasing value of machine learning in the world of finance and accounting. The shift from manual to ML-driven capitalization analysis has several practical implications for finance teams:

  • Consistency: The same classification criteria apply uniformly across all projects, teams, and time periods. Variance attributable to individual judgment is eliminated.
  • Defensibility: Determinations are grounded in systematic analysis of actual work performed, documented in source systems of record. This represents a meaningful improvement over retrospective narratives constructed for accounting purposes.
  • Timeliness: Capitalization analysis can be performed continuously rather than quarterly or annually, enabling earlier identification of projects that may require reclassification.
  • Scalability: The analysis cost scales nearly linearly with project volume, making comprehensive coverage feasible even for organizations with large engineering footprints.

Software capitalization has historically been an area where accounting precision meets engineering ambiguity, with results that satisfy neither discipline. By pairing modern NLP techniques—specifically, fine-tuned large language models—with time-series methods, Neo.Tax has created a data-driven ML system that resolves much of the historical ambiguity of ASC 350-40 software capitalization. This approach delivers a path toward determinations that are both defensible and scalable, pointing to a future of greater ML rigor in financial reporting.
