VerifyThis is a series of program verification competitions that emphasize the human aspect of program analysis. During these events, participating teams are given a set of verification challenges that they have to solve on-site, within the time available, using their preferred verification tools.
During each competition, participants have to implement the given algorithm in the input language of their tool of choice, formalize the specification, and formally prove the correctness of their implementation against that specification. These challenges often feature properties that are beyond the capabilities of fully automatic verification and require human expertise to suitably encode programs, specifications, and invariants.
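For concreteness, here is a minimal, hypothetical sketch of that workflow in Lean 4, standing in for whatever tool a team might actually use. The toy problem and the names mySum and mySum_append are made up for illustration and are not taken from any actual competition: mySum is the implementation, the theorem statement formalizes one property of the specification, and the by block is the machine-checked proof.

```lean
-- Hypothetical mini-challenge (not an actual VerifyThis problem):
-- implement a function, formalize one property of its specification,
-- and prove that the implementation satisfies it.

-- Implementation: sum of a list of natural numbers.
def mySum : List Nat → Nat
  | []      => 0
  | x :: xs => x + mySum xs

-- Specification (one property): summing a concatenation equals
-- summing the two parts separately.
theorem mySum_append (xs ys : List Nat) :
    mySum (xs ++ ys) = mySum xs + mySum ys := by
  induction xs with
  | nil => simp [mySum]                               -- base case: [] ++ ys = ys
  | cons x xs ih => simp [mySum, ih, Nat.add_assoc]   -- inductive step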
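```

A real challenge differs mainly in scale: its specification consists of many such properties, and the hard part is supplying the intermediate lemmas and invariants that let the tool discharge them, which is precisely where human expertise comes in.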
This report analyzes how the participating teams fared on these challenges, reflects on what makes a verification challenge more or less suitable for the typical VerifyThis participant, and outlines the difficulties of comparing the work of teams using wildly different verification approaches in a competition focused on the human aspect.
The most interesting observation is that the more successful teams tended to be those using tools specialized in the kinds of properties and programs featured in the challenges. This is unsurprising: a tool that targets certain kinds of algorithms and programs is more likely to succeed when a challenge falls within its sphere of interest, and less likely to do so when it falls outside it.
On the other hand, less successful teams tended to be those using tools with built-in support for a particular style of concurrency. This is likely because verification challenges involving concurrent programming and reasoning are intrinsically harder, both to design and to verify, so even dedicated concurrency support is no guarantee of success.
Another important observation is that the later a challenge appears in a competition, the fewer teams manage to solve it correctly. This is a natural consequence of competitions running over several hours, with participants inevitably tiring after a long stretch of sustained effort.
As a result, a challenge presented later in a competition may appear harder than it intrinsically is, while the same challenge might have fared better had it appeared earlier. We are interested in determining whether this is indeed the case and, if so, how to better design future editions to make them as accessible as possible.
A possible solution to this problem is to provide a range of challenges, each of which fits the needs of a different kind of participant. This would encourage more varied approaches to verification and help to promote eclectic thinking.
In addition, we suggest that organizers of future events consider adding a new award category to reward the teams that display the widest variety of approaches during the competition. This could help encourage more diverse participation and may even contribute to the development of an ecosystem of complementary approaches and verification frameworks.