Hideki Takayasu, Hideto Kamei, Misako Takayasu

Constructing mathematical models of business bankruptcy from big data is an important problem, both practically and academically. About 50 years ago, Altman pioneered the data analysis of bankruptcy and established three basic steps [1]: preparation of variables, selection of variables, and modeling. We follow these steps using a large database of business firms provided by Teikoku Databank, a Japanese credit research company. Over the past 20 years, about 10,000 bankruptcies have occurred each year among 1 million firms in Japan. We also analyze inter-firm transaction network data comprising about 4 million business relations.
For the preparation of variables, we take about 100 variables from financial reports and add new variables describing network information, such as the number of links. For the selection of variables, we first build a one-body ranking of the variables by the strength of their correlation with bankruptcy events. We then check for spurious correlations by introducing two-body conditional probabilities, and reduce the number of variables accordingly. Finally, for modeling, we combine the chosen variables, selecting the functional form of each so as to maximize the predictive power of the resulting model. This methodology is applicable not only to bankruptcy but also to various complex phenomena in which cause-and-effect relations are unknown and must be estimated from data.
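The selection step can be illustrated with a minimal sketch. It is not the authors' actual procedure: the variable names (`high_debt`, `few_links`), the toy data, and the use of Pearson correlation as the one-body score are all assumptions made for illustration. The sketch ranks variables by the strength of their correlation with a 0/1 bankruptcy flag, then tabulates the bankruptcy rate conditioned on a pair of variables to see whether one variable's correlation disappears once the other is fixed.

```python
from collections import defaultdict
from statistics import mean


def correlation(xs, ys):
    """Pearson correlation between a numeric variable xs and a 0/1 outcome ys."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information
    return cov / (vx * vy) ** 0.5


def one_body_ranking(table, outcome):
    """Rank variable names by |correlation| with the outcome, strongest first."""
    scores = {name: abs(correlation(col, outcome)) for name, col in table.items()}
    return sorted(scores, key=scores.get, reverse=True)


def conditional_rates(flag_a, flag_b, outcome):
    """Estimate P(outcome = 1 | a, b) for every observed pair of flag values."""
    counts = defaultdict(lambda: [0, 0])  # (a, b) -> [bankruptcies, firms]
    for a, b, y in zip(flag_a, flag_b, outcome):
        counts[(a, b)][0] += y
        counts[(a, b)][1] += 1
    return {key: hits / n for key, (hits, n) in counts.items()}


# Hypothetical toy data: each row is (high_debt, few_links, bankrupt).
rows = (
    [(1, 1, 1)] * 2 + [(1, 1, 0)] * 2
    + [(1, 0, 1)] + [(1, 0, 0)]
    + [(0, 1, 0)] * 2
    + [(0, 0, 0)] * 4
)
high_debt = [r[0] for r in rows]
few_links = [r[1] for r in rows]
bankrupt = [r[2] for r in rows]

# One-body step: both variables correlate with bankruptcy on their own.
ranking = one_body_ranking({"high_debt": high_debt, "few_links": few_links}, bankrupt)

# Two-body step: with high_debt fixed, few_links no longer changes the rate,
# so its one-body correlation would be treated as spurious in this toy data.
rates = conditional_rates(high_debt, few_links, bankrupt)
```

In this toy data the bankruptcy rate depends only on `high_debt`. The conditional table therefore shows the same rate for both values of `few_links`, even though `few_links` has a nonzero one-body correlation. With real data the conditional rates would be noisy estimates, so a statistical test would be needed before discarding a variable.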

[1] E. Altman, Journal of Finance, 23 (1968) 589.
[2] H. Kamei, H. Takayasu, Y. Kabashima and M. Takayasu, Proceedings of the Asia-Pacific Econophysics Conference 2016-Big Data Analysis and Modeling toward Super Smart Society (APEC-SSS2016), Japan Physical Society Conference Proceedings, to appear.