AI Startups in Pathology: A Meta-Review

Despite remaining largely unchanged for the past 150 years, pathology continues to serve as a critical first diagnosis point for the vast majority of cancers. Another defining feature of pathology is its underwhelming uptake of digitization. I was surprised to learn that under 20% of labs in the US have gone digital and transitioned away from viewing slides under microscopes. This is reflected in a relatively small AI venture space with just under 20 startups, compared to over 100 in radiology and over 200 in drug discovery. In this article, we will start by exploring pathology, its workflow, and its current state of digitization. We will then analyze 16 startups and map their AI efforts across 3 focus areas: laboratory operations, clinical decision support, and research & development. We will also discuss the workflow management tools and imaging viewers being built around AI models, as well as some auxiliary offerings. Finally, we will identify AI opportunities beyond image interpretation, and how digitization and AI can mature alongside one another to produce tightly integrated pathology solutions.

Patient Story

An abnormal finding is detected on a mammogram during routine annual screening. To investigate further, the physician orders a biopsy and a tissue sample is extracted from the suspicious area. The sample is sent to the pathology lab where technicians start the manual process of sample preparation. This entails fixing the sample, embedding it in paraffin, and sectioning it into thin slices. The slices are stained with chemicals to colorize and highlight certain cells and tissue types. A pathologist will then view them under a microscope. If the sample is cancerous, the appearance of the abnormal cells and their spread will be assessed among other features. All this information is captured in a report that is communicated back to the physician. This entire process from biopsy to diagnostic report will take an average of 10 days in the US^↗︎, a nerve-racking period during which our patient is anxiously waiting to hear back about her breast cancer diagnosis.

The pathology workflow from biopsy to image interpretation. After the slides are prepared, they are either viewed under microscopes, or scanned and digitized for viewing on computer monitors. Most labs today still do the former.

Pathology, Pathologists, and Slides

Pathology is a branch of medical science that determines the presence and extent of a disease. This is done through visually evaluating how healthy tissue is perturbed by pathological processes^↗︎. In fact, the way this evaluation is conducted has seen very little change over the past 150 years. This includes everything from sample preparation methods to viewing slides under an optical microscope. For instance, the haematoxylin and eosin (H&E) stain combination - the most widely used stain and often considered a gold standard^↗︎ - has not changed since it was first introduced in 1876^↗︎. These well-established and trusted pathology protocols continue today to serve as the first diagnosis point for most cancers and other diseases, while complementing multiple areas of study including necrosis (cell death), inflammation, and wound healing. Pathology also extends beyond tissue to examine bodily fluids (e.g. blood, urine) and the whole body through autopsies^↗︎.

Pathologists are often referred to as “the doctor’s doctor” because diagnoses are first assessed by pathologists and then reported to other physicians^↗︎. There are roughly 21,000 pathologists in the US^↗︎ (5.7 per 100,000^↗︎), and a similar number in China serving 4 times the population^↗︎. These numbers point to a growing shortage (18% decrease from 2007 to 2017 in the US^↗︎), especially as cancer cases are on the rise and many pathologists look to retirement. A smaller workforce handling more cases^↗︎ is reflected in longer lab turnaround times and associated physician burnout - which can have dire consequences on diagnostic accuracy. Pathologists are also often considered “lab administrators”, and an overworked pathologist may struggle to carry out this role^↗︎.

In addition to workforce shortages, the general subjectivity of image interpretation is another characteristic of everyday pathology workflows. As is the case with other image-based medical specialities (e.g. radiology), slide interpretation relies heavily on the pathologist’s experience and even their mental state to some extent. Diagnostic disagreements among pathologists are not uncommon and can reach a rate of 11%, with difficulties distinguishing disagreements from errors^↗︎. Another aspect of pathology images is the sheer number of slides per sample that must be examined, making for a cumbersome and error-prone process. A colectomy (surgical removal of the colon) sample may generate dozens of slides, significantly increasing the risk of missing important findings^↗︎. With glass slides, obtaining a second opinion on a given case is only feasible if reviewers are physically in the same lab to view the slide under the microscope. As glass slides are archived and go into long-term storage, it becomes extremely difficult for pathologists to compare cases and identify similarities across samples and cohorts.

Going Digital & Whole Slide Imaging

Digital pathology has continuously promised to revolutionize the pathologist’s workflow for the past 30 years^↗︎. This shift will ultimately address many of the aforementioned issues and enable functionalities that we take for granted today when manipulating digital data, e.g. annotating images with text descriptions or sending images to remote experts for a second opinion. Despite these improvement opportunities, labs have been slow in ditching their microscopes for slides displayed on computer monitors. Many factors have been cited as reasons behind this lag, including high digital migration costs, data storage requirements, and the need to retrain personnel^↗︎. Today, some estimate that 20% and 1% of US labs use digital pathology for secondary and primary diagnosis respectively^↗︎. In addition to streamlining workflows, going digital is also a prerequisite for any subsequent computational image analysis pipeline. This includes AI-based tools, which require large amounts of digital data both for model development and in production. This underwhelming uptake in digitization makes it rather challenging to envision how AI will generally impact pathology workflows. It is also often cited as a reason behind the limited applications of computational pathology (the “omics” or “big data” approach to pathology)^↗︎.

Whole Slide Imaging (WSI) systems are the main enablers of digital pathology. Earlier versions from the 1970s started by displaying microscope images on old-style cathode-ray tube TVs. They eventually matured as cameras became an integral part of digital microscopes, and images - or virtual slides - were sent directly to computers. Today, WSI systems comprise image acquisition scanners as well as display and management software - together with associated communication and storage systems. Just 3 years ago in 2017, the FDA approved its first ever WSI system developed by Philips for primary diagnosis^↗︎, with a second approval in 2019 for Leica^↗︎. While more are expected, it is clear that WSI systems are still in their infancy, especially as the FDA itself is working on developing the evaluation criteria for these devices^↗︎. It is also clear that digitization and WSI system adoption rates ultimately need to reach critical mass to support an AI ecosystem.

AI pathology startups timeline — A timeline of major events related to digital pathology, associated deep learning applications, as well as the founding of 16 startups analyzed here.

AI Solutions for Pathology

The first studies to apply deep learning to WSI data appeared in 2015^↗︎. Around the same time, many incumbents - previously providing image analysis pipelines for pathology slides - rebranded, including Visiopharm^↗︎ with an “app store”-like offering^↗︎, ContextVision^↗︎, Indica Labs^↗︎, and Aira Matrix^↗︎. Concurrently, multiple startups with AI as a key differentiator started to surface. The 16 startups analyzed here (founded 2013 onwards) work within 3 interconnected areas. The first area operates at the laboratory level, and ultimately enables the two other areas: clinical decision support and research & development.

Laboratory Operations
AI applications in this area tend to focus on increasing lab efficiency, quality control, and image management. Being highly operational, these applications are perhaps the least exciting of the three. Nevertheless, they are likely to have the greatest impact in the short term. They are also often advertised as “workflow tools” which may help bypass regulatory roadblocks in some jurisdictions. Examples of these applications include automated detection algorithms to prioritize and triage cases, highlight regions of interest as images are examined, or run tedious tasks such as cell counting. Image similarity algorithms may also be used to index and search images for certain patterns. Some startups in this area include Proscia working on “driving efficiency in high-volume labs”^↗︎, Deciphex with focus on “triage not diagnosis”^↗︎, as well as Techcyte serving niche veterinary pathology labs^↗︎.
Clinical Decision Support
This area focuses on the pathologist’s core clinical tasks: diagnosis and characterization. These may include classification models to identify malignant cells, predict their histology, and grade them based on how differentiated they are from surrounding healthy tissue. Early applications will start by providing a second opinion to pathologists, and it will likely be a while before they become fully autonomous. As in laboratory operations, these applications are also pathologist-facing, hinting at the importance of how and where they are integrated into the workflow. Some startups in this area include Paige with a focus on prostate cancer diagnostics^↗︎, as well as Qritive with the integration of pathology imaging with electronic health records (EHR)^↗︎.
Research & Development
AI applications in this area are geared towards developing imaging biomarkers, i.e. identifying features of an image relevant to a given outcome. Analysis of these features can help in clinical trial recruitment, providing more tailored treatments, and developing companion diagnostics (tests that are co-developed with a drug to aid in selecting patients for treatment with that particular drug). These are perhaps the most exciting applications of the technology, and as expected the furthest behind from a translational standpoint. Startups must often partner with pharmaceutical companies or contract research organizations (CROs) to collaborate on these research projects and gain access to clinical trial data for model development. Some startups in this area include Deep Lens with patient-trial matching at time of diagnosis^↗︎, Aignostics with companion imaging diagnostics^↗︎, as well as Nucleai who are working on imaging biomarkers for immunotherapy response prediction^↗︎.
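
To make the case triage idea from Laboratory Operations more concrete, here is a minimal sketch in Python. It assumes a detection model has already produced one suspicion score per case. The scores, case IDs, and the `triage` function are hypothetical and do not describe any particular startup's product.

```python
import heapq
from typing import Dict, List


def triage(scores: Dict[str, float]) -> List[str]:
    """Order case IDs from most to least suspicious.

    `scores` maps a case ID to a hypothetical model output in [0, 1].
    Ties are broken alphabetically by case ID.
    """
    # heapq is a min-heap, so negate the score to pop the highest first.
    heap = [(-score, case_id) for case_id, score in scores.items()]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


# Hypothetical worklist: case B would be shown to the pathologist first.
worklist = triage({"A": 0.10, "B": 0.90, "C": 0.50})
```

The model never issues a diagnosis in this setup. It only reorders the queue, which is one reason such tools can be positioned as workflow aids.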

AI pathology startups review — A ternary plot with axes for 3 AI focus areas: laboratory operations, clinical decision support, and research & development. The placement of each startup corresponds with its perceived focus area.

It is no surprise that most AI solutions cater for laboratory operations. Labs are a logical starting point with immediate tangible needs. Startups can start there and ultimately grow to provide either diagnostic models or R&D tools. Additionally, some low-hanging AI tasks in this area (e.g. cell detection) can be adequately performed by traditional machine learning methods and do not require deep learning with its large training data requirements. Clinical decision support comes in second place with higher level applications often associated with more risk. If pathologists do not use AI to triage cases and run tedious tasks, they are unlikely to use it for diagnosis. Finally, the R&D area is the least explored so far. Imaging data paired with patient outcomes is perhaps the scarcest, and some pharmaceutical companies choose to develop these technologies in-house as they often have the resources needed.

As startups get closer to the center, especially between clinical decision support and R&D, there may be interesting opportunities to connect providers with pharma. In one direction, these startups could provide pharma with real world data and evidence, both needed to accelerate drug development and inform clinical trial design^↗︎. In the other direction, they would also be in a unique position to envision how the research they conduct may one day be implemented in the clinic.

The Platform and its Auxiliary Services

There is no place for standalone AI algorithms in clinical pathology. Given the infancy of digital pathology, much of the infrastructure needed to contain and serve them does not exist. While this may alleviate clinical integration headaches (healthcare IT is notoriously outdated), it puts startups in a unique position to establish their own turnkey solutions. As a result, most offerings are centered around workflow management software and not the AI as often advertised. Assuming a lab has gone digital, startups will provide a software product that comprises an image viewer for day-to-day case reviews, and may boast report generation, telepathology, and collaboration functionalities. AI components are then offered as “add-ons” or productivity tools. As digitized pathology images include very high levels of magnification, they are relatively large in size: 0.1GB for a 3-dimensional CAT scan vs 3GB for a pathology slide^↗︎. Startups also offer cloud storage to handle this data. Finally, we have the image acquisition hardware for scanning and digitizing glass slides. Startups will work towards being scanner-agnostic, but they will not attempt to make their own. While these scanners were traditionally developed by more established vendors (e.g. Philips^↗︎, Huron Digital Pathology^↗︎), there are new players (e.g. Morphle^↗︎) in this area providing labs with more options while also contributing to vendor fragmentation.
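
To get a feel for what these file sizes mean for storage, here is a rough back-of-the-envelope sketch. The 3GB per slide figure comes from the comparison above. The daily slide volume and the number of working days are made-up assumptions, and real slide sizes vary with magnification and compression.

```python
def annual_storage_tb(
    slides_per_day: int,
    gb_per_slide: float = 3.0,   # per-slide size quoted in the article
    working_days: int = 250,     # assumption: scanning on working days only
) -> float:
    """Estimate one year of uncompressed slide storage in terabytes.

    Uses decimal units (1 TB = 1000 GB) and ignores backups and redundancy.
    """
    total_gb = slides_per_day * gb_per_slide * working_days
    return total_gb / 1000


# Hypothetical lab scanning 1,000 slides per working day.
estimate = annual_storage_tb(1000)  # 750.0 TB per year
```

Even this crude estimate suggests why cloud storage tends to be bundled with the viewer and the AI tools rather than left to the lab.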

AI pathology startup offerings framework — A general diagram depicting the platform built by AI startups in pathology. It comprises a workflow management software and cloud storage that act as infrastructure for running AI models. Startups will integrate with image acquisition scanners, and may provide some auxiliary services.

In addition to the platform and its components, we are also seeing auxiliary services being offered by some startups.

Do-it-yourself AI
This functionality provides users with simple annotation tools that allow them to develop their own models. This concept is not entirely new and is offered by some open source software. For instance, CellProfiler^↗︎ allows you to “annotate” a few examples which would then be propagated across unseen images to perform tasks such as detecting and counting cells. While these tools may be fitting for research contexts where tweaks to AI models are often needed, it is unlikely they would work in a clinical lab setting. In order to understand how they work and where they fail, these tools require an upfront investment of time and energy. Given time constraints in clinical labs, this investment is unlikely to be made. Examples of startups offering these software features include Aiforia^↗︎, deepathology^↗︎, and Deciphex^↗︎.
Slide Digitization
To capitalize on the sheer number of archived glass slides accumulated over the past decades as well as new slides prepared daily, a market for slide digitization has emerged. Startups including Medmian^↗︎, Deciphex^↗︎, and Qritive^↗︎ offer a service to digitize and store virtual slides. This positions them to curate very valuable repositories of retrospective slides, in addition to being in control of pipelines carrying future incoming data. Other companies (e.g. Histowiz^↗︎) are purely focused on digitization where you ship in slides and view them digitally a few days later. On one hand, this enables opportunities for automated image analysis services, the results of which can be delivered to labs without any interruptions to clinical workflow. On the other hand, digitization-as-a-service may not be viable in the long term as more and more labs digitize their own slides.
Academic Offerings
The educational pathology sector was one of the earliest adopters of digitization serving residents as well as practitioners through continuous professional development programs^↗︎. Some startups offer services that allow educators to standardize course and exam materials, as well as improve accessibility by delivering content over the web. Despite its relatively small size, the educational pathology area may help expose students to the technology early on before they enter the workforce.

The multihead microscope. How pathologists were educated at some point in the not so distant past. Source: focusontoxpath.com

Tailwinds & Headwinds

The COVID-19 pandemic has given a huge boost to tele-medicine models and tele-pathology in particular. The arguments for going digital took center stage when it became necessary for pathologists to continue their work remotely without microscopes. In response, the FDA has issued guidance to expand the availability of digital pathology devices during this public health emergency^↗︎, allowing specific devices that are not FDA-cleared to be used clinically. It will be interesting to observe how this urgency in digitization will continue beyond the pandemic. As for AI tool adoption, much friction still exists. While it is common for startups to act as evangelists, startups in pathology must advocate for two technologies simultaneously: digital pathology and AI. This entails investing heavily in content creation through educational courses, workshops, blogs, and webinars. While this adds burden, it may also translate to stronger relationships with a more engaged and loyal user base.

Digitizing both retrospective and prospective glass slides is no easy feat. Slides must be tagged with identifying barcodes and rescanned multiple times if a quality threshold is not met. Given the large size of the resultant images, it is crucial to correctly set the scanner parameters beforehand (e.g. level of magnification, focus plane). For instance, glass slides with thicker sections often produce out-of-focus images, while sections that extend close to the edge of the slide may not be captured by the scanner. This manual data curation may contribute to a cumbersome process, especially for inexperienced technicians. Moreover, the file formats in which WSI data is saved often come with interoperability limitations. In contrast to radiology where virtually all images are saved in the DICOM format^↗︎, digital pathology has yet to converge on a single standard file format. Instead, proprietary vendor-specific formats dominate pathology today^↗︎, and many proposed standards are yet to be adopted by the community at large^↗︎.

As for AI models, access to high-quality annotated data is a major bottleneck. The type of WSI annotations needed is often not part of pathologists’ daily routine, and hence requires additional effort from the time-constrained experts^↗︎. The analysis of these multi-gigabyte images (~6B pixels per image^↗︎) also poses new challenges for deep learning. Slides are often broken down into smaller patches with different annotation approaches at the slide-level (faster but less granular) and at the patch-level (better but more laborious)^↗︎. The amount of noise inherent to WSI data may also negatively impact the robustness of AI solutions. Manual sample preparation practices can differ both within and across labs, resulting in inconsistent image features and making diagnoses more open to debate. Even the level of stain intensity is often driven by the pathologist’s personal preference^↗︎. Scanners used in digitization often provide varying optical appearances and pixel resolutions, and require color calibration to ensure visual consistency^↗︎. Artifacts on glass slides not addressed during digitization (e.g. handwritten text, tape, cracks) may also contribute to more noise^↗︎. All these factors point to the importance of a solid data curation pipeline.
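
To illustrate what breaking a slide into patches involves, here is a small sketch that tiles a slide into a regular grid without overlap. The slide dimensions (100,000 x 60,000 pixels, i.e. about 6B pixels) and the 256-pixel patch size are illustrative assumptions, not values taken from any specific scanner or study.

```python
import math
from typing import Iterator, Tuple


def patch_grid(width: int, height: int, patch: int = 256) -> Tuple[int, int]:
    """Return the number of patch columns and rows needed to cover a slide."""
    return math.ceil(width / patch), math.ceil(height / patch)


def iter_patches(
    width: int, height: int, patch: int = 256
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (x, y, w, h) boxes in row-major order.

    Patches on the right and bottom edges are clipped to the slide bounds.
    """
    for y in range(0, height, patch):
        for x in range(0, width, patch):
            yield x, y, min(patch, width - x), min(patch, height - y)


# Hypothetical ~6B pixel slide.
cols, rows = patch_grid(100_000, 60_000)
n_patches = cols * rows  # 391 * 235 = 91,885 patches
```

With roughly 90,000 patches per slide under these assumptions, it is easy to see why patch-level annotation is laborious and why slide-level labels, where one label covers every patch, are the faster route.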

Opportunities

Virtually all AI applications in pathology cater to the final image interpretation step of the workflow. Conversely, little work has been done to address the highly variable upstream processes of manual slide preparation. This will likely be an area of focus as more holistic AI solutions are proposed. We are already seeing AI research in image correction (e.g. stain normalization, color augmentation^↗︎^↗︎) as well as quality control for flagging slides that deviate from protocol. AI has also been used to digitally stain unlabeled slides and create so-called “virtual stains”^↗︎^↗︎. If truly equivalent to routine staining, this may significantly reduce variability across stains while also disrupting the slide preparation workflow. Perhaps the most interesting AI applications are in computationally de-staining slides^↗︎. This ability to go backwards may allow for greater experimentation with different stains, not to mention the flexibility in capitalizing on archived retrospective slides coupled with very valuable clinical outcome data^↗︎.
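
As a toy illustration of what stain normalization tries to achieve, the sketch below shifts and rescales each color channel of an image so that its mean and standard deviation match those of a reference slide. This is a deliberately simple statistical matching in RGB space, not the method used in the cited research, and the pixel values and reference statistics are invented for the example.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

Pixel = Tuple[int, ...]
Stats = List[Tuple[float, float]]  # one (mean, std) pair per channel


def channel_stats(pixels: Sequence[Pixel]) -> Stats:
    """Compute the (mean, population std) of each color channel."""
    return [(mean(c), pstdev(c)) for c in zip(*pixels)]


def normalize_to_reference(pixels: Sequence[Pixel], ref: Stats) -> List[Pixel]:
    """Match each channel's mean and std to the reference statistics."""
    src = channel_stats(pixels)
    out = []
    for px in pixels:
        new_px = []
        for value, (s_mean, s_std), (r_mean, r_std) in zip(px, src, ref):
            # A flat channel has no spread to rescale, so only shift it.
            scale = r_std / s_std if s_std > 0 else 1.0
            shifted = (value - s_mean) * scale + r_mean
            new_px.append(min(255, max(0, round(shifted))))  # clamp to 8-bit
        out.append(tuple(new_px))
    return out


# Two made-up RGB pixels mapped to made-up reference statistics.
pale = [(100, 50, 150), (120, 70, 170)]
reference = [(200, 20), (100, 10), (180, 5)]
normalized = normalize_to_reference(pale, reference)
```

In practice such corrections are usually done in color spaces better suited to stains and with more robust statistics, but the goal is the same: make slides from different labs and scanners look alike before a model sees them.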

Other opportunities lie in aligning the throughput levels of different parts of the pathology workflow. The slowest by far is sample preparation (days), followed by digitization with some scanners capable of handling up to 200 slides at a time^↗︎ (hours), and finally AI-based image interpretation with the highest throughput (minutes). We are seeing some hardware innovation in automating sample preparation (e.g. Inveox^↗︎), as well as high-throughput methods (e.g. tissue microarrays or TMA^↗︎) becoming more mainstream^↗︎. Given that we are in the early days of large-scale adoption of both digitization and AI in pathology, it will be interesting to observe how these technologies will mature alongside one another. This may also bring along opportunities for tighter integrations across hardware (lab equipment) and software (workflow management, viewers, AI) when compared to other digitized medical disciplines. For instance, co-developing image viewers and AI tools has already led to the rise of so-called “AI-native” viewers. These viewers - in addition to displaying images - are built to visualize model predictions, facilitate data annotation for model training, and provide feedback for model improvement.
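
The throughput mismatch described above can be expressed as a simple bottleneck calculation: a pipeline can only move as fast as its slowest stage. The rates below (slides per hour) are invented purely to mirror the days, hours, and minutes ordering and are not measurements.

```python
from typing import Dict, Tuple


def bottleneck(stage_rates: Dict[str, float]) -> Tuple[str, float]:
    """Return the slowest stage and its rate, which caps the whole pipeline."""
    name = min(stage_rates, key=stage_rates.get)
    return name, stage_rates[name]


def utilization(stage_rates: Dict[str, float]) -> Dict[str, float]:
    """Fraction of each stage's capacity used when the bottleneck sets the pace."""
    _, pace = bottleneck(stage_rates)
    return {stage: pace / rate for stage, rate in stage_rates.items()}


# Hypothetical rates in slides per hour.
rates = {"preparation": 10, "scanning": 40, "ai_interpretation": 400}
slowest, pace = bottleneck(rates)
usage = utilization(rates)
```

Under these made-up numbers the AI stage would sit idle most of the time, which is the argument for investing in faster sample preparation and scanning rather than in faster models alone.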

Human factor considerations will enable smoother workflow transitions. Clinical software and hardware are often quite user-unfriendly, and few legacy pathology solutions exist today. As a result, there may be opportunities to start from a clean slate and demonstrate what the user interfaces and experiences (UI/UX) could look like in such products. In fact, lab efficiency can be enhanced exclusively through better UI/UX and ergonomics without the use of any AI. User-centric design will also play a crucial role in introducing new technologies to users. For example, we have seen how displaying real-time AI predictions through augmented reality in digital microscopes can help expose labs to AI tools before they transition to fully digital workflows^↗︎.

Pathologists and Patients

There is no doubt pathology will eventually become a digital discipline, and the speed of that transition will directly impact the adoption rate of AI tools. Today, 3-dimensional tissue samples are reduced to a select number of 2-dimensional slides to aid pathologists in interpretation^↗︎. As more of this task is assigned to AI, there may be a time when this simplification becomes unnecessary and the computational analysis of entire 3-dimensional tissue samples becomes standard of care^↗︎. While digitization will substitute the microscope with a computer monitor, AI will drive efficiency and ensure greater focus on the tasks that matter. Maybe then pathologists will transition to a more central patient-facing role, a role they are very well positioned to take on^↗︎.