Komprise 2025 AI Survey: AI, Data & Enterprise Risk Report

AI Puts a Shadow on Enterprise AI as Risks Get Real

Komprise surveyed 200 IT directors and executives at U.S. enterprise organizations
with 1,000 or more employees. The purpose of the survey was to discover how IT
teams are preparing their unstructured data for AI and the challenges they
face.

The Komprise IT Survey: AI, Data & Enterprise Risk showed that:

  • Nearly 80% of organizations have experienced negative data incidents with generative AI, with 13% reporting financial, customer, or reputational damage as a result.
  • The vast majority (90%) are concerned about shadow AI from a privacy and security standpoint, with 46% reporting that they are “extremely worried.”
  • The greatest challenge in preparing unstructured data for AI is finding and moving the right data to locations for AI ingestion (54%), followed by a lack of visibility into data.

Download the report to get all of the details along with the 5 key takeaways.


What is shadow AI and why is it a growing enterprise security risk?

Shadow AI refers to the unauthorized or unsanctioned use of AI tools by employees, business units, or developers without centralized IT oversight or security review. It is the AI equivalent of shadow IT (the practice of using technology outside of approved channels), but it carries significantly higher data exposure risk because AI tools actively consume and process the data fed into them. Data-heavy enterprises are especially vulnerable to shadow AI for three reasons: vast amounts of unstructured data sit in shared drives, clouds, and personal folders; employees feed sensitive data into tools such as Claude, ChatGPT, Copilot, or private AI apps; and business units or developers deploy AI models without centralized oversight. Why it matters now:

  • The risk is real and hitting the bottom line — nearly 80% of IT leaders say their organization has experienced negative outcomes from employee use of Generative AI, including false or inaccurate results from queries (46%) and leaking of sensitive data into AI (44%); notably 13% say these poor outcomes have also resulted in financial, customer or reputational damage
  • Concern is near-universal — the vast majority (90%) are concerned about shadow AI from a privacy and security standpoint, with 46% reporting that they are extremely worried
  • The attack surface is unstructured data — most of the sensitive data that ends up in shadow AI tools originates from unstructured file shares, email archives, and document repositories that lack data classification, tagging, and governance controls
  • GenAI tools amplify the risk — free, widely available generative AI tools make it trivially easy for an employee to paste patient records, legal contracts, or proprietary IP directly into a public model without any organizational visibility or control

What did the Komprise IT Survey: AI, Data and Enterprise Risk find about the state of enterprise AI governance?

The Komprise IT Survey: AI, Data and Enterprise Risk surveyed 200 IT directors and executives at U.S. enterprise organizations with 1,000 or more employees in April 2025 to discover how IT teams are preparing their unstructured data for AI and the challenges they face. The headline finding is that AI risk is no longer theoretical: 13% of respondents report that negative AI outcomes have already caused financial, customer, or reputational damage. Key data points:

  • Damage is already happening — nearly 80% of organizations have experienced negative data incidents with generative AI, with 13% reporting financial, customer, or reputational damage as a result; the most common bad outcomes include false or inaccurate results from queries (46%) and leaking of sensitive data into AI (44%)
  • Shadow AI is the top concern — 90% of respondents are concerned about shadow AI from a privacy and security standpoint, and nearly half (46%) are extremely worried about the security and compliance impact of unauthorized and unsanctioned use of AI tools
  • Data preparation is the hardest problem — the greatest challenge in preparing unstructured data for AI is finding and moving the right data to locations for AI ingestion (54%) followed by a lack of visibility into data across storage environments
  • AI infrastructure has overtaken cybersecurity in budget priority — despite tariff-driven economic uncertainty and rising hardware prices, AI infrastructure now outranks cybersecurity and cost control in IT budgets as leaders invest in fast, secure systems and AI-ready storage
  • Most plan to use data management to fight back — most organizations (75%) plan to use data management technologies to address risks from shadow AI, followed closely by AI discovery and monitoring tools (74%)

How can enterprise IT teams reduce shadow AI risk without blocking legitimate AI adoption?

The most effective response to shadow AI is not restriction but governance — giving employees sanctioned, safe AI tools while simultaneously ensuring that the sensitive data those tools can access has been classified, controlled, and protected upstream. Managing unstructured data at scale requires automation that can efficiently find, tag, and move curated datasets into AI pipelines and monitor workflows with auditing capabilities. The Komprise approach to shadow AI governance:

  • Classify before AI can access it — the Komprise Global Metadatabase continuously indexes all unstructured data across every NAS, cloud, and object storage silo, capturing sensitivity status, PII indicators, PHI flags, and custom classification tags; data that has been classified and tagged cannot silently find its way into unauthorized AI tools
  • Sensitive data detection at scale: Komprise Sensitive Data Management uses built-in PII and PHI scanners, custom regex, and keyword search to find sensitive content across petabyte-scale unstructured data estates; flagged files can be automatically moved to protected storage tiers, excluded from AI pipelines, or confined by policy
  • Smart Data Workflows enforce governance automatically — rather than relying on employees to make the right data decisions, Deep Analytics queries the Global Metadatabase and Smart Data Workflows act on the results, ensuring sensitive data is handled by policy before it reaches any AI pipeline, sanctioned or otherwise
  • Audit trails for compliance — every data classification, movement, and ingestion event is logged with complete lineage, supporting HIPAA, GDPR, and internal governance reviews; when a shadow AI incident is investigated, IT has a complete record of where sensitive data was and who had access to it
  • KAPPA for custom sensitivity detection: KAPPA data services extend sensitive data discovery to proprietary file formats that standard scanners miss, including DICOM medical images, genomics BAM files, and domain-specific documents, using serverless processing at petabyte scale
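
As a rough illustration of the regex and keyword detection described in the list above, the following Python sketch tags text with sensitivity labels and separates AI-eligible files from excluded ones. It is not Komprise code and does not use any Komprise API: the two patterns, the keyword list, and the names `scan_text` and `partition` are hypothetical, and real scanners rely on far more robust detection than this.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns, for illustration only.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}
# Hypothetical keywords, matched case-insensitively.
KEYWORDS = ("confidential", "patient id")


@dataclass
class ScanResult:
    path: str
    tags: set = field(default_factory=set)

    @property
    def ai_eligible(self) -> bool:
        # A file with no sensitivity tags may be passed to an AI pipeline.
        return not self.tags


def scan_text(path: str, text: str) -> ScanResult:
    """Tag a document with every pattern or keyword it contains."""
    result = ScanResult(path)
    for tag, pattern in PATTERNS.items():
        if pattern.search(text):
            result.tags.add(tag)
    lowered = text.lower()
    for keyword in KEYWORDS:
        if keyword in lowered:
            result.tags.add(f"keyword:{keyword}")
    return result


def partition(docs: dict) -> tuple:
    """Split documents into AI-eligible paths and excluded paths with reasons."""
    results = [scan_text(path, text) for path, text in docs.items()]
    eligible = [r.path for r in results if r.ai_eligible]
    excluded = {r.path: sorted(r.tags) for r in results if not r.ai_eligible}
    return eligible, excluded


# Made-up sample documents.
docs = {
    "notes/meeting.txt": "Agenda: roadmap review and budget planning.",
    "hr/record.txt": "Employee SSN 123-45-6789, contact jane@example.com",
    "legal/draft.txt": "CONFIDENTIAL: draft supplier contract.",
}
eligible, excluded = partition(docs)
print(eligible)
print(excluded)
```

Keeping the reasons for each exclusion alongside the path is a deliberate choice here: it gives a simple record that could feed the kind of audit trail mentioned above.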

Why is finding and moving the right data the biggest obstacle to enterprise AI readiness, and how does Komprise solve it?

Finding and moving the right data is the top AI data preparation challenge, cited by 54% of respondents. This reflects a structural problem that storage purchases alone cannot fix. Enterprise unstructured data is scattered across dozens of NAS systems, cloud object stores, and hybrid environments with no unified index, no consistent metadata, and no way to distinguish high-value AI-ready datasets from noise. The visibility gap is the root cause:

  • No unified view means no curation — without a single, queryable index of all unstructured data across all storage silos, IT teams cannot identify which files are relevant, recent, sensitive, or duplicated; they are forced into manual, ad hoc processes that do not scale to petabytes
  • The Global Metadatabase closes the gap — the Komprise Global Metadatabase continuously indexes standard and enriched metadata across every storage environment without moving a single file; this unified layer is what makes it possible to run a precise query — for example, all chest X-rays for male patients over 35 with a specific diagnosis — and reduce millions of files to thousands in seconds
  • Automation replaces manual curation — AI pipelines demand automation that can index data across environments, efficiently find, tag, and move curated datasets into those pipelines, and monitor workflows with auditing capabilities; Komprise Smart Data Workflows deliver this automation, operating continuously across the full file and object data estate
  • Finding the right data also means excluding the wrong data: Komprise Intelligent AI Ingest filters 70%+ of unstructured data noise including duplicates, outdated files, and irrelevant content before it reaches AI pipelines; this directly addresses the survey finding that false or inaccurate AI results are the most common negative outcome, since poor input data is the primary cause of poor AI output
  • Storage cost optimization and AI readiness are the same motion — the same Komprise analysis that identifies cold data candidates for Flash Stretch tiering also identifies which data is AI-relevant; organizations that address storage cost bloat with intelligent tiering simultaneously build the governed, classified data estate that AI readiness requires
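
To make the idea of querying a metadata index and filtering noise more concrete, here is a minimal Python sketch. It assumes a simple in-memory list of file records with precomputed content hashes and tags. The `FileRecord` type, the tag names, and the `query` and `drop_duplicates` helpers are hypothetical and are not the Komprise Global Metadatabase or its query interface. The sketch selects records by tag and recency without touching file contents, then keeps only the newest copy of each duplicate.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileRecord:
    path: str
    content_hash: str  # checksum assumed to be precomputed by the index
    modified: date
    tags: frozenset


def query(index, required_tags, modified_since):
    """Return records carrying all required tags and modified on or after a cutoff."""
    return [
        r for r in index
        if required_tags <= r.tags and r.modified >= modified_since
    ]


def drop_duplicates(records):
    """Keep the most recently modified record for each content hash."""
    seen = set()
    unique = []
    for r in sorted(records, key=lambda rec: rec.modified, reverse=True):
        if r.content_hash not in seen:
            seen.add(r.content_hash)
            unique.append(r)
    return unique


# Made-up index entries spanning several storage locations.
chest_xray = frozenset({"modality:xray", "body:chest"})
index = [
    FileRecord("/nas1/xray/001.dcm", "h1", date(2024, 5, 1), chest_xray),
    FileRecord("/cloud/copy/001.dcm", "h1", date(2024, 3, 1), chest_xray),
    FileRecord("/nas2/xray/old.dcm", "h2", date(2015, 1, 1), chest_xray),
    FileRecord("/nas1/mri/002.dcm", "h3", date(2024, 6, 1),
               frozenset({"modality:mri", "body:head"})),
]

matches = query(index, chest_xray, modified_since=date(2020, 1, 1))
curated = drop_duplicates(matches)
print([r.path for r in curated])
```

In this toy data set the query discards the stale scan and the unrelated MRI, and deduplication removes the older cloud copy, so only one of the four files would be handed to an AI pipeline.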

How are enterprise IT teams balancing AI infrastructure investment against rising storage costs and economic uncertainty?

The survey captures a tension that every enterprise IT leader is managing right now: AI investment is accelerating in spite of economic conditions, not because of them. On-and-off U.S. tariff policies have disrupted supply chains and raised prices on many products, including those designed for the data center. Despite this unpredictability in pricing and product sourcing, organizations are moving full steam ahead on AI. How IT leaders are navigating the squeeze:

  • AI infrastructure has become the top budget priority — AI infrastructure now outranks cybersecurity and cost control in IT budgets; despite economic pressure, leaders are investing in fast, secure systems and AI-ready storage to ensure data is prepared and protected for AI with proper compliance measures
  • Storage cost optimization funds AI investment — with flash and NAND prices rising 130% by the end of 2026 according to Gartner, the fastest way to free up budget for AI infrastructure is to reclaim the capacity already locked in cold, ungoverned data on expensive primary storage; Komprise Flash Stretch identifies and tiers this data transparently, freeing 70%+ of primary NAS capacity without a hardware purchase
  • The hidden cost of ungoverned AI — 13% of organizations report that negative AI outcomes have resulted in financial, customer, or reputational damage; the true cost of shadow AI incidents — regulatory fines, breach remediation, customer churn — often exceeds the cost of implementing proper AI data governance in the first place
  • GPU waste compounds the cost pressure — feeding ungoverned, unclassified unstructured data to AI models wastes expensive GPU compute on noise, duplicates, and irrelevant content; Komprise filters this noise before ingestion, reducing AI compute costs and improving model accuracy simultaneously
  • The right strategy balances both — balancing the priorities of AI infrastructure investment and preparing and managing data for AI can help organizations deliver safe, optimized AI services; Komprise addresses both simultaneously, cutting storage costs through intelligent tiering while building the governed, classified data foundation that AI pipelines require to produce accurate, trustworthy results