
Covering All the Bases: Type-Based Verification of Test Input Generators by matt_d
Abstract
Test input generators are an important part of property-based testing (PBT) frameworks. Because PBT is intended to test deep semantic and structural properties of a program, the outputs produced by these generators can be complex data structures, constrained to satisfy properties the developer believes is most relevant to testing the function of interest. An important feature expected of these generators is that they be capable of producing all acceptable elements that satisfy the function’s input type and generator-provided constraints. However, it is not readily apparent how we might validate whether a particular generator’s output satisfies this coverage requirement. Typically, developers must rely on manual inspection and post-mortem analysis of test runs to determine if the generator is providing sufficient coverage; these approaches are error-prone and difficult to scale as generators become more complex. To address this important concern, we present a new refinement type-based verification procedure for validating the coverage provided by input test generators, based on a novel interpretation of types that embeds “must-style” underapproximate reasoning principles as a fundamental part of the type system. The types associated with expressions now capture the set of values guaranteed to be produced by the expression, rather than the typical formulation that uses types to represent the set of values an expression may produce. Beyond formalizing the notion of coverage types in the context of a rich core language with higher-order procedures and inductive datatypes, we also present a detailed evaluation study to justify the utility of our ideas.
- Aws Albarghouthi, Loris D’Antoni, Samuel Drews, and Aditya V. Nori. 2017. FairSquare: Probabilistic Verification of Program Fairness. Proc. ACM Program. Lang., 1, OOPSLA (2017), Article 80, oct, 30 pages. https://doi.org/10.1145/3133904
- Osbert Bastani, Xin Zhang, and Armando Solar-Lezama. 2019. Probabilistic Verification of Fairness Properties via Concentration. Proc. ACM Program. Lang., 3, OOPSLA (2019), Article 118, Oct, 27 pages. https://doi.org/10.1145/3360544
- Koen Claessen. 2020. QuickCheck. https://hackage.haskell.org/package/QuickCheck
Google Scholar - Koen Claessen, Jonas Duregård, and Michał H. Pał ka. 2014. Generating Constrained Random Data with Uniform Distribution. Journal of Functional Programming, 25, isbn:978-3-319-07150-3 https://doi.org/10.1007/978-3-319-07151-0_2
- Koen Claessen and John Hughes. 2000. QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs. In Proceedings of the Fifth ACM SIGPLAN International Conference on Functional Programming (ICFP ’00). Association for Computing Machinery, New York, NY, USA. 268–279. isbn:1581132026 https://doi.org/10.1145/351240.351266
- Leonardo de Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. In Tools and Algorithms for the Construction and Analysis of Systems, C. R. Ramakrishnan and Jakob Rehof (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 337–340. isbn:978-3-540-78800-3 https://doi.org/10.1007/978-3-540-78800-3_24
- Edsko de Vries and Vasileios Koutavas. 2011. Reverse Hoare Logic. In Software Engineering and Formal Methods, Gilles Barthe, Alberto Pardo, and Gerardo Schneider (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg. 155–171. isbn:978-3-642-24690-6 https://doi.org/10.1007/978-3-642-24690-6_12
- Stephen Dolan. 2022. Crowbar. https://github.com/stedolan/crowbar
Google Scholar - Jana Dunfield and Neel Krishnaswami. 2021. Bidirectional Typing. ACM Comput. Surv., 54, 5 (2021), Article 98, may, 38 pages. issn:0360-0300 https://doi.org/10.1145/3450952
- Peter Dybjer, Qiao Haiyan, and Makoto Takeyama. 2003. Combining Testing and Proving in Dependent Type Theory. In Theorem Proving in Higher Order Logics, 16th International Conference, TPHOLs 2003, Rom, Italy, September 8-12, 2003, Proceedings, David A. Basin and Burkhart Wolff (Eds.) (Lecture Notes in Computer Science, Vol. 2758). Springer, 188–203. https://doi.org/10.1007/10930755_12
- Peter Dybjer, Qiao Haiyan, and Makoto Takeyama. 2004. Random Generators for Dependent Types. In Theoretical Aspects of Computing – ICTAC 2004, First International Colloquium, Guiyang, China, September 20-24, 2004, Revised Selected Papers, Zhiming Liu and Keijiro Araki (Eds.) (Lecture Notes in Computer Science, Vol. 3407). Springer, 341–355. https://doi.org/10.1007/978-3-540-31862-0_25
- 2022. fast-check: Property based testing for JavaScript and TypeScript. https://dubzzz.github.io/fast-check.github.com/
Google Scholar - Burke Fetscher, Koen Claessen, Michal H. Palka, John Hughes, and Robert Bruce Findler. 2015. Making Random Judgments: Automatically Generating Well-Typed Terms from the Definition of a Type-System. In Programming Languages and Systems – 24th European Symposium on Programming, ESOP 2015, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2015, London, UK, April 11-18, 2015. Proceedings, Jan Vitek (Ed.) (Lecture Notes in Computer Science, Vol. 9032). Springer, 383–405. https://doi.org/10.1007/978-3-662-46669-8_16
- Cormac Flanagan, Amr Sabry, Bruce F. Duba, and Matthias Felleisen. 1993. The Essence of Compiling with Continuations. In Proceedings of the ACM SIGPLAN 1993 Conference on Programming Language Design and Implementation (PLDI ’93). Association for Computing Machinery, New York, NY, USA. 237–247. isbn:0897915984 https://doi.org/10.1145/155090.155113
- Kimball Germane and Jay McCarthy. 2021. Newly-single and Loving It: Improving Higher-Order Must-Alias Analysis with Heap Fragments. Proc. ACM Program. Lang., 5, ICFP (2021), 1–28. https://doi.org/10.1145/3473601
- Patrice Godefroid, Aditya V. Nori, Sriram K. Rajamani, and Sai Deep Tetali. 2010. Compositional May-Must Program Analysis: Unleashing the Power of Alternation. In Proceedings of the 37th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’10). Association for Computing Machinery, New York, NY, USA. 43–56. isbn:9781605584799 https://doi.org/10.1145/1706299.1706307
- John Hatcliff and Olivier Danvy. 1994. A Generic Account of Continuation-Passing Styles. In Proceedings of the 21st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’94). Association for Computing Machinery, New York, NY, USA. 458–471. isbn:0897916360 https://doi.org/10.1145/174675.178053
- C. A. R. Hoare. 1969. An Axiomatic Basis for Computer Programming. Commun. ACM, 12, 10 (1969), oct, 576–580. issn:0001-0782 https://doi.org/10.1145/363235.363259
- 2022. Hypothesis. https://github.com/HypothesisWorks/hypothesis/tree/master/hypothesis-python
Google Scholar - Suresh Jagannathan, Peter Thiemann, Stephen Weeks, and Andrew K. Wright. 1998. Single and Loving It: Must-Alias Analysis for Higher-Order Languages. In POPL ’98, Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, San Diego, CA, USA, January 19-21, 1998, David B. MacQueen and Luca Cardelli (Eds.). ACM, 329–341. https://doi.org/10.1145/268946.268973
- Ranjit Jhala and Niki Vazou. 2021. Refinement Types: A Tutorial. Found. Trends Program. Lang., 6, 3-4 (2021), 159–317. https://doi.org/10.1561/2500000032
- Leonidas Lampropoulos. 2018. Random Testing for Language Design. Ph. D. Dissertation. University of Pennsylvania. https://repository.upenn.edu/edissertations/2879
Google Scholar - Leonidas Lampropoulos, Diane Gallois-Wong, Cătălin Hriţcu, John Hughes, Benjamin C. Pierce, and Li-yao Xia. 2017. Beginner’s Luck: A Language for Property-Based Generators. SIGPLAN Not., 52, 1 (2017), jan, 114–129. issn:0362-1340 https://doi.org/10.1145/3093333.3009868
- Leonidas Lampropoulos, Michael Hicks, and Benjamin C. Pierce. 2019. Coverage Guided, Property Based Testing. Proc. ACM Program. Lang., 3, OOPSLA (2019), Article 181, oct, 29 pages. https://doi.org/10.1145/3360607
- Leonidas Lampropoulos, Zoe Paraskevopoulou, and Benjamin C. Pierce. 2018. Generating Good Generators for Inductive Relations. Proc. ACM Program. Lang., 2, POPL (2018), 45:1–45:30. https://doi.org/10.1145/3158133
- Leonidas Lampropoulos and Benjamin C. Pierce. 2022. QuickChick: Property-Based Testing in Coq (Software Foundations, Vol. 4). Electronic textbook. Version 1.3.1, https://softwarefoundations.cis.upenn.edu
Google Scholar - Quang Loc Le, Azalea Raad, Jules Villard, Josh Berdine, Derek Dreyer, and Peter W. O’Hearn. 2022. Finding Real Bugs in Big Programs with Incorrectness Logic. Proc. ACM Program. Lang., 6, OOPSLA (2022), 1–27. https://doi.org/10.1145/3527325
- Ashish Mishra and Suresh Jagannathan. 2022. Specification-Guided Component-Based Synthesis from Effectful Libraries. Proc. ACM Program. Lang., 6, OOPSLA2 (2022), Article 147, oct, 30 pages. https://doi.org/10.1145/3563310
- Peter W. O’Hearn. 2019. Incorrectness Logic. Proc. ACM Program. Lang., 4, POPL (2019), Article 10, dec, 32 pages. https://doi.org/10.1145/3371078
- Rohan Padhye, Caroline Lemieux, Koushik Sen, Mike Papadakis, and Yves Le Traon. 2019. Semantic Fuzzing with Zest. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis. Association for Computing Machinery, New York, NY, USA. 329–340. isbn:9781450362245 https://doi.org/10.1145/3293882.3330576
- Michał H. Pał ka, Koen Claessen, Alejandro Russo, and John Hughes. 2011. Testing an Optimising Compiler by Generating Random Lambda Terms. In Proceedings of the 6th International Workshop on Automation of Software Test (AST ’11). Association for Computing Machinery, New York, NY, USA. 91–97. isbn:9781450305921 https://doi.org/10.1145/1982595.1982615
- Zoe Paraskevopoulou, Cătălin HriŢcu, Maxime Dénès, Leonidas Lampropoulos, and Benjamin C. Pierce. 2015. Foundational Property-Based Testing. In Interactive Theorem Proving, Christian Urban and Xingyuan Zhang (Eds.). Springer International Publishing, Cham. 325–343. https://doi.org/10.1007/978-3-319-22102-1_22
- Vaughan R Pratt. 1976. Semantical consideration on Floyd-Hoare logic. In 17th Annual Symposium on Foundations of Computer Science (sfcs 1976). 109–121.
Google ScholarDigital Library
- Azalea Raad, Josh Berdine, Hoang-Hai Dang, Derek Dreyer, Peter O’Hearn, and Jules Villard. 2020. Local Reasoning About the Presence of Bugs: Incorrectness Separation Logic. In Computer Aided Verification: 32nd International Conference, CAV 2020, Los Angeles, CA, USA, July 21–24, 2020, Proceedings, Part II. Springer-Verlag, Berlin, Heidelberg. 225–252. isbn:978-3-030-53290-1 https://doi.org/10.1007/978-3-030-53291-8_14
- 2021. Crate for PBT in Rust. https://github.com/BurntSushi/quickcheck
Google Scholar - 2021. ScalaCheck. https://scalacheck.org/
Google Scholar - Eric L. Seidel, Niki Vazou, and Ranjit Jhala. 2015. Type Targeted Testing. In Programming Languages and Systems. Springer Berlin Heidelberg, Berlin, Heidelberg. 812–836. https://doi.org/10.1007/978-3-662-46669-8_33
- Niki Vazou, Eric L. Seidel, Ranjit Jhala, Dimitrios Vytiniotis, and Simon Peyton-Jones. 2014. Refinement Types for Haskell. SIGPLAN Not., 49, 9 (2014), aug, 269–282. issn:0362-1340 https://doi.org/10.1145/2692915.2628161
- Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. 2011. Finding and Understanding Bugs in C Compilers. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’11). Association for Computing Machinery, New York, NY, USA. 283–294. isbn:9781450306638 https://doi.org/10.1145/1993498.1993532
- Michal Zalewski. 2020. American Fuzzy Lop. https://github.com/google/afl
Google Scholar - Zhe Zhou, Robert Dickerson, Benjamin Delaware, and Suresh Jagannathan. 2021. Data-driven abductive inference of library specifications. Proc. ACM Program. Lang., 5, OOPSLA (2021), 1–29. https://doi.org/10.1145/3485493
- Zhe Zhou, Ashish Mishra, Benjamin Delaware, and Suresh Jagannathan. 2023. Covering All the Bases: Type-Based Verification of Test Input