
Economics of Contemporary Russia

Innovative Approach to Information Search by Example of a Patent Analysis of an Import Substitution Plan

https://doi.org/10.33293/1609-1442-2020-1(88)-143-157

Abstract

Information now accumulates so rapidly that the conventional notion of iterative search needs revision. In a world oversaturated with information, comprehensive coverage and analysis of a problem place high demands on search methods. An innovative approach to search should flexibly take into account both the large body of accumulated knowledge and a priori requirements for the results. The results, in turn, should immediately provide a roadmap of the field under study that can be explored in as much detail as needed. Search based on topic modeling, so-called topic search, meets all these requirements. It thereby streamlines work with information, increases the efficiency of knowledge production, and helps avoid cognitive biases in the perception of information, which matters at both the micro and macro levels. To demonstrate topic search, the article considers the task of analyzing an import substitution program using patent data. The program includes plans for 22 industries and covers more than 1,500 products and technologies proposed for import substitution. Patent search based on topic modeling makes it possible to query directly with blocks of a priori information (the terms of the industry import substitution plans) and to obtain a selection of relevant documents for each industry. This approach not only gives a comprehensive picture of the effectiveness of the program as a whole, but also shows in visual form, and in more detail, which groups of products and technologies have been patented.
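
The idea of querying with whole blocks of plan terms can be illustrated with a minimal sketch. The abstract does not specify the model or the software used, so the code below substitutes plain PLSA fitted by EM in pure Python; it is not the author's implementation. The toy patent texts, the industry names, and all function names (`fit_plsa`, `fold_in`, `topic_search`) are hypothetical and serve only as illustration. Each industry's term block is folded into the fitted model as a pseudo-document, and patents are ranked by cosine similarity between topic profiles.

```python
import math
import random
from collections import Counter

# Hypothetical, pre-tokenized toy "patents" (not real data).
PATENTS = [
    ["tablet", "dosage", "compound", "antibiotic", "compound"],
    ["vaccine", "antibiotic", "dosage", "tablet"],
    ["spindle", "lathe", "cutting", "tool", "lathe"],
    ["milling", "cutting", "tool", "spindle"],
]
# Hypothetical blocks of a priori terms, one block per industry plan.
PLAN_TERMS = {
    "pharmaceuticals": ["antibiotic", "vaccine", "tablet"],
    "machine_tools": ["lathe", "milling", "spindle"],
}

def em_step_doc(counts, phi, theta_d, n_wt=None):
    """One EM pass over one document; returns updated p(topic | doc)."""
    n_topics = len(phi)
    n_t = [0.0] * n_topics
    for word, n in counts.items():
        if word not in phi[0]:  # skip words outside the model vocabulary
            continue
        post = [phi[t][word] * theta_d[t] for t in range(n_topics)]
        z = sum(post)
        if z == 0.0:
            continue
        for t in range(n_topics):
            r = n * post[t] / z  # expected count of (word, topic)
            n_t[t] += r
            if n_wt is not None:
                n_wt[t][word] += r
    total = sum(n_t)
    return [x / total for x in n_t] if total > 0 else theta_d

def normalize_row(row):
    total = sum(row.values())
    return {w: v / total for w, v in row.items()}

def fit_plsa(docs, n_topics, n_iter=100, seed=0):
    """Fit PLSA by EM; phi[t][w] = p(word | topic), theta[d][t] = p(topic | doc)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    counts = [Counter(doc) for doc in docs]
    phi = [normalize_row({w: rng.random() + 0.1 for w in vocab})
           for _ in range(n_topics)]
    theta = [[1.0 / n_topics] * n_topics for _ in docs]
    for _ in range(n_iter):
        # Tiny pseudo-count keeps every p(word | topic) strictly positive.
        n_wt = [dict.fromkeys(vocab, 1e-9) for _ in range(n_topics)]
        theta = [em_step_doc(c, phi, th, n_wt) for c, th in zip(counts, theta)]
        phi = [normalize_row(row) for row in n_wt]
    return phi, theta

def fold_in(terms, phi, n_iter=30):
    """Estimate a topic profile for a term block while keeping phi fixed."""
    counts = Counter(terms)
    theta_q = [1.0 / len(phi)] * len(phi)
    for _ in range(n_iter):
        theta_q = em_step_doc(counts, phi, theta_q)
    return theta_q

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def topic_search(plan_terms, phi, theta, top_k=2):
    """For each industry, return indices of the top_k most similar documents."""
    results = {}
    for industry, terms in plan_terms.items():
        q = fold_in(terms, phi)
        ranked = sorted(range(len(theta)),
                        key=lambda d: cosine(q, theta[d]), reverse=True)
        results[industry] = ranked[:top_k]
    return results

if __name__ == "__main__":
    phi, theta = fit_plsa(PATENTS, n_topics=2)
    for industry, doc_ids in topic_search(PLAN_TERMS, phi, theta).items():
        print(industry, doc_ids)
```

Comparing documents and queries in topic space rather than by exact keyword match is what allows a whole block of terms to act as a single query. On a real patent collection the number of topics, the text preprocessing, and the choice of similarity measure would all need tuning.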

About the Author

Maria A. Milkova
Central Economics and Mathematics Institute of the Russian Academy of Sciences, Moscow
Russian Federation


For citations:


Milkova M.A. Innovative Approach to Information Search by Example of a Patent Analysis of an Import Substitution Plan. Economics of Contemporary Russia. 2020;(1):143-157. (In Russ.) https://doi.org/10.33293/1609-1442-2020-1(88)-143-157


This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1609-1442 (Print)
ISSN 2618-8996 (Online)