INTRODUCING

Code Review Guidelines Optimized for Practical Use

Sider inspects source code using a range of analyzers and linters. Many coding standards exist, but a large share of their rules are redundant or impractical for the developers who adopt them. To address this, Sider has created ‘Recommended Rules’, an original code review guideline that aligns with the needs of developers.

Current availability of Recommended Rules

Recommended Rules are available in Sider for the following languages.

C++
Java
Python
PHP
JavaScript
TypeScript
Ruby

Rationale for Recommended Rules

Recommended Rules for each language were created by researching and analyzing highly active and popular repositories shared in the open-source software (OSS) community. Through this process, we were able to determine which coding guidelines were actually being used, and which of them were redundant.

The rules were created heuristically as follows:

  • Select a base coding guideline (e.g. in the case of Java, the base coding guideline could be Sun Coding Standard, Google Coding Standard, etc.).
  • Analyze 1,000 popular and active OSS repositories. By tracing back every git commit, we were able to reconstruct every coding guideline violation and determine whether it was dealt with, as well as how and when that occurred.
  • Apply machine learning (clustering) to establish which coding guidelines are ‘actually’ being followed by the development community of these 1,000 repositories.
  • The extracted collection of coding guidelines is then manually checked, after which it is made available as Recommended Rules.
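The clustering step can be pictured with a small sketch. This is only an illustration under stated assumptions, not Sider's actual pipeline: it assumes each rule is summarized by a single "fix rate" (violations later fixed divided by violations introduced), and it uses a plain two-cluster k-means on that one number. The rule names and counts are invented sample data, not research results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleStats:
    """Per-rule counts reconstructed from commit history (hypothetical shape)."""
    rule: str
    introduced: int  # violations that appeared at some commit
    fixed: int       # violations that a later commit removed

    @property
    def fix_rate(self) -> float:
        return self.fixed / self.introduced if self.introduced else 0.0


def two_means_1d(values, iterations=50):
    """Plain k-means with k=2 on 1-D values; returns (low_center, high_center)."""
    low, high = min(values), max(values)
    for _ in range(iterations):
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        new_low = sum(low_group) / len(low_group) if low_group else low
        new_high = sum(high_group) / len(high_group) if high_group else high
        if (new_low, new_high) == (low, high):
            break  # centers stopped moving
        low, high = new_low, new_high
    return low, high


def followed_rules(stats):
    """Return rules whose fix rate is closer to the high cluster center."""
    if not stats:
        return []
    low, high = two_means_1d([s.fix_rate for s in stats])
    # If every rule has the same fix rate, no rule is strictly closer to the
    # high center, so the result is empty.
    return sorted(
        s.rule for s in stats
        if abs(s.fix_rate - high) < abs(s.fix_rate - low)
    )


# Invented sample data for illustration only.
sample = [
    RuleStats("indentation", introduced=200, fixed=180),
    RuleStats("unused-import", introduced=100, fixed=85),
    RuleStats("line-length", introduced=400, fixed=40),
    RuleStats("quote-style", introduced=300, fixed=15),
]
print(followed_rules(sample))  # ['indentation', 'unused-import']
```

In this sketch, rules that developers tend to fix end up in the "followed" cluster, while rules whose violations mostly stay in the code are treated as noise. A real analysis would likely use richer features (for example, how long a violation survives) and would still need the manual check described in the last step.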

REFERENCED OSS

Part of the 1,000 repository samples selected from the OSS community

Customization of Recommended Rules

Recommended Rules should be understood as "rules widely followed by developers of a specific programming language". Compared to the default settings of conventional static analyzers and linters, Recommended Rules is intended to be less ‘noisy’, with fewer false positives. Because this ‘noise’ is suppressed, Recommended Rules may not report some issues that a conventional tool would flag by default.

For example, in a project (repository) where you want to strictly distinguish between "" (double quotation marks) and '' (single quotation marks), Recommended Rules will not flag violations of that convention.

In such a case, you can remove Recommended Rules and create your own custom rules and settings. Alternatively, you can place the Recommended Rules configuration file in the repository and modify it to add or enable the rules you need.
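Conceptually, customizing on top of Recommended Rules means starting from the recommended set and applying project-specific additions and removals. The sketch below shows that idea only. It does not reproduce Sider's or any analyzer's configuration format, and the function name and rule names are hypothetical.

```python
def effective_rules(recommended, enable=(), disable=()):
    """Return the sorted rule set after applying project overrides.

    If a rule appears in both `enable` and `disable`, it ends up disabled.
    """
    rules = set(recommended)
    rules |= set(enable)   # rules the project wants in addition
    rules -= set(disable)  # rules the project considers noise
    return sorted(rules)


# Hypothetical rule names, for illustration only.
recommended = ["unused-variable", "undefined-name"]

# A project that cares about quotation style adds its own rule.
print(effective_rules(recommended, enable=["quote-style"]))
# ['quote-style', 'undefined-name', 'unused-variable']
```

In practice, the equivalent change is made in each analyzer's own configuration file rather than in code like this.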

Please refer to the following document for specific settings pertaining to each analyzer.

Rules can be added or removed as needed.

The Recommended Rules above were created by analyzing 1,000 OSS repositories as training data. Depending on the project, there may therefore still be rules that you want to enable or disable. At Sider, we are currently researching new ways to automatically generate the best recommended rules for each repository. If you are interested in this, please let us know via the chat at the bottom right of the screen.

Collaborators

Recommended Rules is based on joint research with academic institutions and researchers.

Research papers, Presentations

Toru Kurashige, Kentaro Suetsugu, Koichiro Sumi, Masataka Nagura, Shingo Takada, Akihiro Asahara: An Investigation about Occurrences of Coding Style Violation for OSS, IPSJ/SIGSE Software Engineering Symposium (SES2020) (poster presentation), Sept. 2020

Masataka Nagura, Kensuke Taguchi, Shingo Takada: A Defect Prediction Method for Software Changes Based on Coding Style Violation Metrics, IPSJ Journal (in Japanese), Vol. 61, No. 4, pp. 895-907, Apr. 2020

Co-Researchers

Prof. Shingo Takada
Takada Lab, Department of Information & Computer Science, Keio University, Japan.

Prof. Masataka Nagura
Department of Software Engineering, Faculty of Science and Technology, Nanzan University, Japan.
