Sound Call Graph Construction for Java Object: Conclusions, Acknowledgment, & References

8 Feb 2024

This paper is available on arXiv under a CC 4.0 license.

Authors:

(1) JOANNA C. S. SANTOS, University of Notre Dame, USA;

(2) MEHDI MIRAKHORLI, University of Hawaii at Manoa, USA;

(3) ALI SHOKRI, Virginia Tech, USA.

7 Conclusion

We presented an approach to support the static analysis of serialization-related features in Java programs. It works under the assumptions that only classes in the classpath are serialized/deserialized, and that all of their instance fields are non-null and can be allocated with any type that is safe. By applying these assumptions and relying on API modeling, our approach adds synthetic nodes into a previously computed call graph to improve its soundness with respect to serialization-related features.
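To make the notion of a serialization callback concrete, the following minimal sketch shows a class whose private `writeObject` and `readObject` methods are never called explicitly by the program, yet run when the object is serialized and deserialized. The class `Pet`, the `TRACE` list, and the `roundTrip` helper are hypothetical names chosen for illustration; they are not taken from the paper or its implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class CallbackDemo {
    // Records which callbacks the serialization runtime invoked (illustrative only).
    static final List<String> TRACE = new ArrayList<>();

    // Hypothetical serializable class; its name and field are assumptions.
    static class Pet implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;

        Pet(String name) {
            this.name = name;
        }

        // No call site in this program targets this method; the JDK invokes it
        // reflectively from ObjectOutputStream.writeObject.
        private void writeObject(ObjectOutputStream out) throws IOException {
            TRACE.add("Pet.writeObject");
            out.defaultWriteObject();
        }

        // Likewise invoked reflectively from ObjectInputStream.readObject.
        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            TRACE.add("Pet.readObject");
            in.defaultReadObject();
        }
    }

    // Serializes the object to an in-memory buffer and reads it back.
    static Pet roundTrip(Pet pet) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(pet);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Pet) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Pet copy = roundTrip(new Pet("Rex"));
        // prints: Rex [Pet.writeObject, Pet.readObject]
        System.out.println(copy.name + " " + TRACE);
    }
}
```

A call graph builder that only follows explicit call sites sees the calls to `ObjectOutputStream.writeObject` and `ObjectInputStream.readObject` but has no edge into `Pet.writeObject` or `Pet.readObject`; edges of this kind are what a serialization-aware analysis needs to account for.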

We evaluated our approach with respect to its soundness (RQ1), precision (RQ2), performance (RQ3), and usefulness for a downstream client analysis (RQ4). We used 9 programs from the CATS Test Suite [Reif et al. 2018] and 10 projects from the XCorpus dataset [Dietrich et al. 2017b]. We compared our approach's soundness and precision against off-the-shelf call graph construction algorithms available in Soot [Vallée-Rai et al. 1999], Wala [IBM [n.d.]], OPAL [Eichberg and Hermann 2014], and Doop [Bravenboer and Smaragdakis 2009].

In our experiments, we found that only the call graphs that used CHA or RTA could (partially) infer the callback methods that could arise at runtime. Our approach, on the other hand, provided support for all the callback methods involved in serialization and deserialization. In an analysis comparing runtime call graphs with the statically built call graphs, our approach introduced fewer spurious edges. Finally, by measuring the running times of our approach and comparing them with those of its counterpart call graph construction algorithms (Salsa and Wala), we found that our approach did not incur significant overhead.
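The comparison between runtime and static call graphs can be pictured as two set differences over edges. The sketch below is only an illustration of that idea, not the evaluation harness used in the paper; the class `EdgeComparison`, its method names, and the edge strings are all hypothetical. It assumes Java 9 or later for `Set.of`.

```java
import java.util.Set;
import java.util.TreeSet;

public class EdgeComparison {
    // Static edges never observed at runtime: candidates for spurious edges.
    static Set<String> notObserved(Set<String> staticEdges, Set<String> runtimeEdges) {
        Set<String> result = new TreeSet<>(staticEdges);
        result.removeAll(runtimeEdges);
        return result;
    }

    // Runtime edges absent from the static graph: evidence of unsoundness.
    static Set<String> missing(Set<String> staticEdges, Set<String> runtimeEdges) {
        Set<String> result = new TreeSet<>(runtimeEdges);
        result.removeAll(staticEdges);
        return result;
    }

    public static void main(String[] args) {
        // Edges are written as "caller->callee"; all names are made up.
        Set<String> staticEdges = Set.of(
                "Main.main->ObjectInputStream.readObject",
                "ObjectInputStream.readObject->Pet.readObject",
                "ObjectInputStream.readObject->Car.readObject");
        Set<String> runtimeEdges = Set.of(
                "Main.main->ObjectInputStream.readObject",
                "ObjectInputStream.readObject->Pet.readObject");

        System.out.println("not observed: " + notObserved(staticEdges, runtimeEdges));
        System.out.println("missing: " + missing(staticEdges, runtimeEdges));
    }
}
```

An edge that was not observed at runtime is not necessarily infeasible, since a runtime call graph only covers the executions that were exercised, so such a count is best read as an upper bound on the spurious edges.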

Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant No. CNS-1816845 and Grant No. CCF-1943300. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

References

2023. TensorFlow. https://www.tensorflow.org [Online; accessed 21. Oct. 2023].

Karim Ali, Xiaoni Lai, Zhaoyi Luo, Ondrej Lhotak, Julian Dolby, and Frank Tip. 2019. A Study of Call Graph Construction for JVM-Hosted Languages. IEEE Transactions on Software Engineering (2019), 1–1. https://doi.org/10.1109/TSE.2019.2956925

Karim Ali and Ondřej Lhoták. 2012. Application-only call graph construction. In European Conference on Object-Oriented Programming. Springer, 688–712.

Anastasios Antoniadis, Nikos Filippakis, Paddy Krishnan, Raghavendra Ramesh, Nicholas Allen, and Yannis Smaragdakis. 2020. Static analysis of Java enterprise applications: frameworks and caches, the elephants in the room. In PLDI. 794–807.

Steven Arzt, Siegfried Rasthofer, Christian Fritz, Eric Bodden, Alexandre Bartel, Jacques Klein, Yves Le Traon, Damien Octeau, and Patrick McDaniel. 2014. FlowDroid: Precise Context, Flow, Field, Object-Sensitive and Lifecycle-Aware Taint Analysis for Android Apps. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (Edinburgh, United Kingdom) (PLDI ’14). ACM, New York, NY, USA, 259–269. https://doi.org/10.1145/2594291.2594299

David F Bacon and Peter F Sweeney. 1996. Fast static analysis of C++ virtual function calls. In Proceedings of the 11th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications. 324–341. https://doi.org/10.1145/236337.236371

Osbert Bastani, Rahul Sharma, Lazaro Clapp, Saswat Anand, and Alex Aiken. 2019. Eventually Sound Points-To Analysis with Specifications. In 33rd European Conference on Object-Oriented Programming (ECOOP 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. https://doi.org/10.4230/LIPIcs.ECOOP.2019.11

Eric Bodden, Andreas Sewe, Jan Sinschek, Hela Oueslati, and Mira Mezini. 2011. Taming Reflection: Aiding Static Analysis in the Presence of Reflection and Custom Class Loaders. In Proceedings of the 33rd International Conference on Software Engineering (ICSE’11). ACM, New York, NY, USA, 241–250. https://doi.org/10.1145/1985793.1985827

Martin Bravenboer and Yannis Smaragdakis. 2009. Strictly Declarative Specification of Sophisticated Points-to Analyses. In Proceedings of the 24th ACM SIGPLAN Conference on Object Oriented Programming Systems Languages and Applications (Orlando, Florida, USA) (OOPSLA ’09). Association for Computing Machinery, New York, NY, USA, 243–262. https://doi.org/10.1145/1640089.1640108

Sicong Cao, Biao He, Xiaobing Sun, Yu Ouyang, Chao Zhang, Xiaoxue Wu, Ting Su, Lili Bo, Bin Li, Chuanlei Ma, et al. 2023. ODDFUZZ: Discovering Java Deserialization Vulnerabilities via Structure-Aware Directed Greybox Fuzzing. arXiv preprint arXiv:2304.04233 (2023).

Cristina Cifuentes, Andrew Gross, and Nathan Keynes. 2015. Understanding Caller-Sensitive Method Vulnerabilities: A Class of Access Control Vulnerabilities in the Java Platform. In Proceedings of the 4th ACM SIGPLAN International Workshop on State Of the Art in Program Analysis (Portland, OR, USA) (SOAP 2015). ACM, New York, NY, USA, 7–12. https://doi.org/10.1145/2771284.2771286

Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. 1991. Efficiently Computing Static Single Assignment Form and the Control Dependence Graph. ACM Trans. Program. Lang. Syst. 13, 4 (1991), 451–490. https://doi.org/10.1145/115372.115320

Jeffrey Dean, David Grove, and Craig Chambers. 1995. Optimization of object-oriented programs using static class hierarchy analysis. In European Conference on Object-Oriented Programming. Springer, 77–101. https://doi.org/10.1007/3-540-49538-X_5

Jens Dietrich, Kamil Jezek, Shawn Rasheed, Amjed Tahir, and Alex Potanin. 2017a. Evil Pickles: DoS Attacks Based on Object-Graph Engineering. In 31st European Conference on Object-Oriented Programming (ECOOP 2017), Vol. 74. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 10:1–10:32. https://doi.org/10.4230/LIPIcs.ECOOP.2017.10

Jens Dietrich, Henrik Schole, Li Sui, and Ewan Tempero. 2017b. XCorpus – An executable Corpus of Java Programs. Journal of Object Technology 16, 4 (Aug. 2017), 1:1–24. https://doi.org/10.5381/jot.2017.16.4.a1

Julian Dolby, Mandana Vaziri, and Frank Tip. 2007. Finding bugs efficiently with a SAT solver. In Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering. 195–204.

Michael Eichberg. 2020. JCG - SerializableClasses. https://bitbucket.org/delors/jcg/src/master/jcg_testcases/src/main/resources/Serialization.md. (Accessed on 06/01/2020).

Michael Eichberg and Ben Hermann. 2014. A Software Product Line for Static Analyses: The OPAL Framework. In Proceedings of the 3rd ACM SIGPLAN International Workshop on the State of the Art in Java Program Analysis (Edinburgh, United Kingdom) (SOAP ’14). Association for Computing Machinery, New York, NY, USA, 1–6. https://doi.org/10.1145/2614628.2614630

William Enck, Peter Gilbert, Seungyeop Han, Vasant Tendulkar, Byung-Gon Chun, Landon P. Cox, Jaeyeon Jung, Patrick McDaniel, and Anmol N. Sheth. 2014. TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones. ACM Trans. Comput. Syst. 32, 2, Article 5 (June 2014), 29 pages. https://doi.org/10.1145/2619091

A. Feldthaus, M. Schäfer, M. Sridharan, J. Dolby, and F. Tip. 2013. Efficient construction of approximate call graphs for JavaScript IDE services. In 2013 35th International Conference on Software Engineering (ICSE). 752–761. https://doi.org/10.1109/ICSE.2013.6606621

Yu Feng, Xinyu Wang, Isil Dillig, and Thomas Dillig. 2015. Bottom-Up Context-Sensitive Pointer Analysis for Java. In Programming Languages and Systems - 13th Asian Symposium, APLAS 2015, Pohang, South Korea, November 30 - December 2, 2015, Proceedings (Lecture Notes in Computer Science, Vol. 9458), Xinyu Feng and Sungwoo Park (Eds.). Springer, 465–484. https://doi.org/10.1007/978-3-319-26529-2_25

George Fourtounis, George Kastrinis, and Yannis Smaragdakis. 2018. Static analysis of Java dynamic proxies. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. 209–220.

Chris Frohoff. 2018. frohoff/ysoserial: A proof-of-concept tool for generating payloads that exploit unsafe Java object deserialization. https://github.com/frohoff/ysoserial. (Accessed on 05/26/2018).

David Grove and Craig Chambers. 2001. A framework for call graph construction algorithms. ACM Transactions on Programming Languages and Systems (TOPLAS) 23, 6 (2001), 685–746.

David Grove, Greg DeFouw, Jeffrey Dean, and Craig Chambers. 1997. Call graph construction in object-oriented languages. In Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications (OOPSLA’97). ACM, New York, NY, USA, 108–124. https://doi.org/10.1145/263698.264352

Ian Haken. 2018. Automated Discovery of Deserialization Gadget Chains.

Nevin Heintze and Olivier Tardieu. 2001. Demand-driven pointer analysis. ACM SIGPLAN Notices 36, 5 (2001), 24–34. https://doi.org/10.1145/381694.378802

Michael Hind. 2001. Pointer analysis: Haven’t we solved this problem yet?. In Proceedings of the 2001 ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering. 54–61. https://doi.org/10.1145/379605.379665

Stephen Hines, Prasad Kulkarni, David Whalley, and Jack Davidson. 2005. Using De-Optimization to Re-Optimize Code (EMSOFT ’05). Association for Computing Machinery, New York, NY, USA, 114–123. https://doi.org/10.1145/1086228.1086251

IBM. [n.d.]. T.J. Watson Libraries for Analysis (WALA). http://wala.sourceforge.net/wiki/index.php/Main_Page. (Accessed on 06/05/2020).

N. Jovanovic, C. Kruegel, and E. Kirda. 2006. Pixy: a static analysis tool for detecting Web application vulnerabilities. In 2006 IEEE Symposium on Security and Privacy (S&P’06). 6 pp. – 263. https://doi.org/10.1109/SP.2006.29

George Kastrinis and Yannis Smaragdakis. 2013. Hybrid context-sensitivity for points-to analysis. ACM SIGPLAN Notices 48, 6 (2013), 423–434. https://doi.org/10.1145/2499370.2462191

R. Khatchadourian, Y. Tang, M. Bagherzadeh, and S. Ahmed. 2019. Safe Automated Refactoring for Intelligent Parallelization of Java 8 Streams. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). 619–630. https://doi.org/10.1109/ICSE.2019.00072

Nikolaos Koutroumpouchos, Georgios Lavdanis, Eleni Veroni, Christoforos Ntantogian, and Christos Xenakis. 2019. ObjectMap: detecting insecure object deserialization. In Proceedings of the 23rd Pan-Hellenic Conference on Informatics. 67–72.

Sriteja Kummita, Goran Piskachev, Johannes Späth, and Eric Bodden. 2021. Qualitative and Quantitative Analysis of Callgraph Algorithms for Python. In 2021 International Conference on Code Quality (ICCQ). IEEE, 1–15. https://doi.org/10.1109/ICCQ51190.2021.9392986

Davy Landman, Alexander Serebrenik, and Jurgen J. Vinju. 2017. Challenges for Static Analysis of Java Reflection: Literature Review and Empirical Study. In Proceedings of the 39th International Conference on Software Engineering (Buenos Aires, Argentina) (ICSE’17). IEEE, New York, NY, USA, 507–518. https://doi.org/10.1109/ICSE.2017.53

Ondřej Lhoták and Laurie Hendren. 2006. Context-sensitive points-to analysis: is it worth it?. In International Conference on Compiler Construction. Springer, 47–64. https://doi.org/10.1007/11688839_5

Yue Li, Tian Tan, Yulei Sui, and Jingling Xue. 2014. Self-Inferencing Reflection Resolution for Java. In Proceedings of the 28th European Conference on ECOOP 2014 – Object-Oriented Programming - Volume 8586. Springer-Verlag, Berlin, Heidelberg, 27–53. https://doi.org/10.1007/978-3-662-44202-9_2

Yue Li, Tian Tan, and Jingling Xue. 2019. Understanding and analyzing Java reflection. ACM Transactions on Software Engineering and Methodology (TOSEM) 28, 2 (2019), 1–50. https://doi.org/10.1145/3295739

Liu Ping, Su Jin, and Yang Xinfeng. 2011. Research on software security vulnerability detection technology. In Proceedings of 2011 International Conference on Computer Science and Network Technology, Vol. 3. 1873–1876. https://doi.org/10.1109/ICCSNT.2011.6182335

Alvaro Muñoz and Christian Schneider. 2018. Serial killer: Silently pwning your Java endpoints. http://www.slideshare.net/cschneider4711/owasp-benelux-day-2016-serial-killer-silently-pwning-your-java-endpoints. (Accessed on 11/15/2019).

Gail C Murphy, David Notkin, William G Griswold, and Erica S Lan. 1998. An empirical study of static call graph extractors. ACM Transactions on Software Engineering and Methodology (TOSEM) 7, 2 (1998), 158–191.

Oracle. 2010. Java Object Serialization Specification - version 6.0. https://docs.oracle.com/javase/8/docs/platform/serialization/spec/serialTOC.html. (Accessed on 04/07/2020).

Or Peles and Roee Hay. 2015. One Class to Rule Them All: 0-Day Deserialization Vulnerabilities in Android. In 9th USENIX Workshop on Offensive Technologies (WOOT 15). USENIX Association, Washington, D.C., 12.

Shawn Rasheed and Jens Dietrich. 2020. A hybrid analysis to detect Java serialisation vulnerabilities. In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering. 1209–1213.

Michael Reif. 2023. mreif/jcg - Docker Image | Docker Hub. https://hub.docker.com/r/mreif/jcg [Online; accessed 20. Oct. 2023].

Michael Reif, Florian Kübler, Michael Eichberg, Dominik Helm, and Mira Mezini. 2019. Judge: Identifying, Understanding, and Evaluating Sources of Unsoundness in Call Graphs. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis (Beijing, China) (ISSTA 2019). ACM, New York, NY, USA, 251–261. https://doi.org/10.1145/3293882.3330555

Michael Reif, Florian Kübler, Michael Eichberg, and Mira Mezini. 2018. Systematic Evaluation of the Unsoundness of Call Graph Construction Algorithms for Java. In Companion Proceedings for the ISSTA/ECOOP 2018 Workshops (ISSTA’18). ACM, New York, NY, USA, 107–112. https://doi.org/10.1145/3236454.3236503

Henry Gordon Rice. 1953. Classes of recursively enumerable sets and their decision problems. Trans. Amer. Math. Soc. 74, 2 (1953), 358–366.

Barry K Rosen, Mark N Wegman, and F Kenneth Zadeck. 1988. Global value numbers and redundant computations. In Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages. 12–27.

Atanas Rountev, Ana Milanova, and Barbara G Ryder. 2001. Points-to analysis for Java using annotated constraints. ACM SIGPLAN Notices 36, 11 (2001), 43–55. https://doi.org/10.1145/504311.504286

Joanna C. S. Santos, Reese A. Jones, Chinomso Ashiogwu, and Mehdi Mirakhorli. 2021. Serialization-Aware Call Graph Construction. In Proceedings of the 10th ACM SIGPLAN International Workshop on the State of the Art in Program Analysis

Joanna C. S. Santos, Reese A. Jones, and Mehdi Mirakhorli. 2020. Salsa: Static Analysis of Serialization Features. In Proceedings of the 22nd ACM SIGPLAN International Workshop on Formal Techniques for Java-Like Programs (FTfJP ’20) (Virtual) (FTfJP 2020). ACM, New York, NY, USA, 18–25. https://doi.org/10.1145/3427761.3428343

Imen Sayar, Alexandre Bartel, Eric Bodden, and Yves Le Traon. 2023. An in-depth study of java deserialization remote-code execution exploits and vulnerabilities. ACM Transactions on Software Engineering and Methodology 32, 1 (2023), 1–45.

Christian Schneider and Alvaro Muñoz. 2016. Java Deserialization Attacks. https://owasp.org/www-pdf-archive/GOD16-Deserialization.pdf. (Accessed on 11/15/2019).

Edward J Schwartz, Thanassis Avgerinos, and David Brumley. 2010. All you ever wanted to know about dynamic taint analysis and forward symbolic execution (but might have been afraid to ask). In 2010 IEEE symposium on Security and privacy. IEEE, 317–331.

Hossain Shahriar and Hisham Haddad. 2016. Object injection vulnerability discovery based on latent semantic indexing. In Proceedings of the 31st Annual ACM Symposium on Applied Computing. 801–807.

M. Sharp and A. Rountev. 2006. Static Analysis of Object References in RMI-Based Java Software. IEEE Transactions on Software Engineering 32, 9 (2006), 664–681. https://doi.org/10.1109/TSE.2006.93

Mikhail Shcherbakov and Musard Balliu. 2021. Serialdetector: Principled and practical exploration of object injection vulnerabilities for the web. In Network and Distributed Systems Security (NDSS) Symposium 2021, 21-24 February 2021.

Yannis Smaragdakis, George Balatsouras, George Kastrinis, and Martin Bravenboer. 2015. More Sound Static Handling of Java Reflection. In Programming Languages and Systems, Xinyu Feng and Sungwoo Park (Eds.). Springer International Publishing, Cham, 485–503. https://doi.org/10.1007/978-3-319-26529-2_26

Yannis Smaragdakis and George Kastrinis. 2018. Defensive Points-To Analysis: Effective Soundness via Laziness. In 32nd European Conference on Object-Oriented Programming (ECOOP 2018). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik. https://doi.org/10.4230/LIPIcs.ECOOP.2018.23

NSA Center for Assured Software. 2023. Juliet Java 1.3. https://samate.nist.gov/SARD/test-suites/111 [Online; accessed 1. May. 2022].

Manu Sridharan, Shay Artzi, Marco Pistoia, Salvatore Guarnieri, Omer Tripp, and Ryan Berg. 2011. F4F: Taint Analysis of Framework-Based Web Applications. In Proceedings of the 2011 ACM International Conference on Object Oriented Programming Systems Languages and Applications (Portland, Oregon, USA) (OOPSLA ’11). ACM, New York, NY, USA, 1053–1068. https://doi.org/10.1145/2048066.2048145

Manu Sridharan, Satish Chandra, Julian Dolby, Stephen J. Fink, and Eran Yahav. 2013. Alias Analysis for Object-Oriented Programs. Springer Berlin Heidelberg, Berlin, Heidelberg, 196–232. https://doi.org/10.1007/978-3-642-36946-9_8

Li Sui, Jens Dietrich, Michael Emery, Shawn Rasheed, and Amjed Tahir. 2018. On the soundness of call graph construction in the presence of dynamic language features-a benchmark and tool evaluation. In Asian Symposium on Programming Languages and Systems. Springer, 69–88.

Li Sui, Jens Dietrich, Amjed Tahir, and George Fourtounis. 2020. On the Recall of Static Call Graph Construction in Practice. https://doi.org/10.1145/3377811.3380441

H. Thaller, L. Linsbauer, A. Egyed, and S. Fischer. 2020. Towards Fault Localization via Probabilistic Software Modeling. In 2020 IEEE Workshop on Validation, Analysis and Evolution of Software Tests (VST). 24–27. https://doi.org/10.1109/VST50071.2020.9051635

Julian Thomé, Lwin Khin Shar, Domenico Bianculli, and Lionel C Briand. 2017. Joanaudit: A tool for auditing common injection vulnerabilities. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. 1004–1008.

Frank Tip and Jens Palsberg. 2000. Scalable propagation-based call graph construction algorithms. In Proceedings of the 15th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications. 281–293.

Raja Vallée-Rai, Phong Co, Etienne Gagnon, Laurie Hendren, Patrick Lam, and Vijay Sundaresan. 1999. Soot - a Java Bytecode Optimization Framework. In Proceedings of the 1999 Conference of the Centre for Advanced Studies on Collaborative Research (Mississauga, Ontario, Canada) (CASCON ’99). IBM Press, 13.

Marvin Wyrich and Justus Bogner. 2019. Towards an Autonomous Bot for Automatic Source Code Refactoring. In Proceedings of the 1st International Workshop on Bots in Software Engineering (Montreal, Quebec, Canada) (BotSE ’19). IEEE Press, 24–28. https://doi.org/10.1109/BotSE.2019.00015