Thoth: Comprehensive Policy Compliance in Data Retrieval Systems


Data retrieval systems process data from many sources, each subject to its own data use policy. Ensuring compliance with these policies despite bugs, misconfiguration, or operator error in a large, complex, and fast-evolving system is a major challenge. Thoth provides an efficient, kernel-level compliance layer for data use policies. Declarative policies are attached to the systems’ input and output files, key-value tuples, and network connections, and specify the data’s integrity and confidentiality requirements. Thoth tracks the flow of data through the system, and enforces policy regardless of bugs, misconfigurations, compromises in application code, or actions by unprivileged operators. Thoth requires minimal changes to an existing system and has modest overhead, as we show using a prototype Thoth-enabled data retrieval system based on the popular Apache Lucene.
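To make the idea of policies attached to conduits and enforced by flow tracking concrete, the following is a minimal user-space sketch in Python. It is not Thoth's policy language or its kernel-level mechanism. It assumes a deliberately simplified model in which a policy is only a set of permitted readers, a process accumulates the policies of everything it reads, and a write is allowed only if the destination admits no reader that an accumulated policy forbids. All names (`Policy`, `Conduit`, `Process`, and the example conduits) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    """Hypothetical confidentiality policy: the principals allowed to read."""
    readers: frozenset


@dataclass
class Conduit:
    """Stand-in for a file, key-value tuple, or network connection."""
    name: str
    policy: Policy
    data: list = field(default_factory=list)


@dataclass
class Process:
    """A task whose taint is the set of policies of everything it has read."""
    name: str
    taint: set = field(default_factory=set)

    def read(self, src: Conduit) -> list:
        # Reading makes the process subject to the source's policy.
        self.taint.add(src.policy)
        return list(src.data)

    def write(self, dst: Conduit, item: str) -> bool:
        # Deny the write if the destination admits any reader that
        # some policy in the taint set does not permit.
        for policy in self.taint:
            if not dst.policy.readers <= policy.readers:
                return False
        dst.data.append(item)
        return True


# Assumed example: an indexer reads public and private data, then
# tries to send results to two different users' connections.
public_page = Conduit("public_page", Policy(frozenset({"alice", "bob"})), ["news"])
alice_mail = Conduit("alice_mail", Policy(frozenset({"alice"})), ["hello"])
conn_alice = Conduit("conn_alice", Policy(frozenset({"alice"})))
conn_bob = Conduit("conn_bob", Policy(frozenset({"bob"})))

indexer = Process("indexer")
indexer.read(public_page)
indexer.read(alice_mail)

sent_to_alice = indexer.write(conn_alice, "result")  # permitted by both policies
sent_to_bob = indexer.write(conn_bob, "result")      # blocked by alice_mail's policy
```

In this sketch the check lives outside the indexer's own logic, so a bug in the indexer cannot cause Alice's mail to reach Bob's connection. That is the property the abstract describes, here reduced to a toy reader-set model.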

Proceedings of the USENIX Security Symposium