Jobs

Study of artificial intelligence algorithms with information theory via the information-bottleneck framework 

Posted on the 27th November 2023

Internship proposal (6 months) 

Reference: VCL112023

Internship supervisor 

 Mitsubishi Electric R&D Centre Europe: Vincent Corlay, researcher, v.corlay@fr.merce.mee.com 

 Overall context 

Mitsubishi Electric is an international company with a broad activity in electrical and electronic systems, in particular telecommunications, robotics, factory automation equipment, and on-board devices for cars. Mitsubishi Electric R&D Centre Europe (MERCE), whose French branch is located in Rennes, is a research laboratory devising new generations of communication systems, reliable software methodologies, and future technologies in the field of power electronic systems.

In recent years, artificial intelligence (AI) [1] has seen large-scale deployment in engineering and non-engineering areas alike. For instance, AI and machine learning (ML) based methods have attracted great research interest for physical-layer design problems in the wireless communication domain. Thanks to AI/ML algorithms, a communication system modelled as a chain of multiple independent blocks with different functionalities may be replaced by a more compact, non-modular system model with better end-to-end performance. However, a satisfactory understanding of AI models is far from being attained. For example, the Transformer at the heart of the GPT engine [2] has been thoroughly validated experimentally, but no theoretical study is available that gives mathematical insight into why it performs so well.

This internship proposes to analyse and optimize artificial intelligence algorithms with information-theoretic tools. The topic is of particular interest to MERCE as it is common to several application domains and can be applied to improving the computational efficiency of AI techniques in current 5G and future 6G core networks.

Internship subject

The objective of this internship is to survey state-of-the-art tools enabling the theoretical analysis of complex neural networks. Our tools are linear algebra, functional analysis, measure theory, and information theory. Indeed, information theory can bring methods that are not available in other scientific fields. Tishby suggested using the Information Bottleneck (IB) concept to understand deep neural networks [3][4]. However, the IB principle raised a controversy in the deep learning community [5]. Other information-theoretic approaches also exist to analyse and understand deep neural networks. For example, [6] considers the last hidden layer (the feature function) of a deep neural network (DNN) and studies the feature extraction process in the DNN; a score, called the H-score, is introduced to measure the effectiveness of a model. Concretely, the main goal of the intern will be to reproduce the results of Tishby [3][4] in order to gain a better understanding of the issues highlighted in [6]. If time allows, simulations with the H-score may also be considered.
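For reference, the IB framework of [3][4] can be summarised as follows (a standard formulation; the exact notation of [3][4] should be followed during the internship). Given an input X and a target Y, one looks for a representation T obeying the Markov chain Y - X - T by minimising, over stochastic encoders p(t|x), the Lagrangian

    L[p(t|x)] = I(X;T) - β I(T;Y),

where β > 0 trades the compression of the input, measured by I(X;T), against the preservation of information about the target, measured by I(T;Y). In [3], each hidden layer of a trained network is treated as such a representation T, and the pair (I(X;T), I(T;Y)) is tracked during training in the so-called information plane.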

 

 Detailed objectives  

. Understand the Information Bottleneck framework [3][4].

. Reproduce the results of Tishby [3][4] to gain a better understanding of the issues highlighted in [6] (a minimal illustration is sketched after this list).

. Write the internship report.
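As an indication of the kind of experiment expected for the second objective, the sketch below estimates the two information-plane coordinates I(X;T) and I(T;Y) by discretising (binning) the activations T of one hidden layer, which is the estimator used in [3]. It is a minimal illustration under simplifying assumptions (equal-width bins, plug-in estimates); function and variable names are illustrative, not imposed.

    import numpy as np

    def discrete_mutual_information(a, b):
        # Plug-in estimate (in bits) of I(A;B) for two sequences of discrete symbols.
        n = len(a)
        joint, pa, pb = {}, {}, {}
        for x, y in zip(a, b):
            joint[(x, y)] = joint.get((x, y), 0) + 1
            pa[x] = pa.get(x, 0) + 1
            pb[y] = pb.get(y, 0) + 1
        mi = 0.0
        for (x, y), c in joint.items():
            mi += (c / n) * np.log2(c * n / (pa[x] * pb[y]))
        return mi

    def information_plane_point(X_ids, Y_labels, T_activations, n_bins=30):
        # One point (I(X;T), I(T;Y)) for a single layer: activations are binned into
        # n_bins equal-width intervals, and each distinct binned pattern is one symbol.
        edges = np.linspace(T_activations.min(), T_activations.max(), n_bins + 1)
        binned = np.digitize(T_activations, edges)
        T_ids = [row.tobytes() for row in binned]
        return (discrete_mutual_information(X_ids, T_ids),
                discrete_mutual_information(T_ids, Y_labels))

In practice, X_ids would typically index the training examples (so that I(X;T) is computed with respect to the empirical input distribution), and one such point would be computed per layer and per training epoch to trace trajectories of the kind reported in [3].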

 

 Prerequisites 

The candidate must be open to multidisciplinary work that mixes information theory, machine learning, advanced mathematics, and software development. 

• Ability to develop in Python or Matlab

• An interest in research (e.g., through a future PhD thesis) 

• Autonomy 

• Good command of English (reading and writing)

• Good literature search skills 

Duration: 6 months 

Period: from February/March 2024 (some flexibility is possible, depending on schools' internship periods)

Contact

Magali BRANCHEREAU (jobs@fr.merce.mee.com)
Vincent Corlay (v.corlay@fr.merce.mee.com) 

Please provide an application letter and your CV mentioning the reference of the internship. The signature of an internship agreement with your school is mandatory.

References

[1] I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning. The MIT Press, 2016. 

[2] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, L. Kaiser, I. Polosukhin, “Attention Is All You Need,” Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, 2017. 

[3] R. Shwartz-Ziv and N. Tishby, “Opening the Black Box of Deep Neural Networks via Information,” arXiv:1703.00810 [cs.LG], available at https://arxiv.org/abs/1703.00810, Apr. 2017.

[4] A.A. Alemi, I. Fischer, J.V. Dillon, and K. Murphy, “Deep Variational Information Bottleneck,” International Conference on Learning Representations (ICLR 2017), Toulon, France, Apr. 2017.

[5] A.M. Saxe, Y. Bansal, J. Dapello, M. Advani, A. Kolchinsky, B.D. Tracey, and D.D. Cox, “On the Information Bottleneck Theory of Deep Learning,” J. Stat. Mech.: Theory Exp., vol. 2019, 124020, 2019.

[6] X. Xu, S.-L. Huang, L. Zheng, and G.W. Wornell, “An Information Theoretic Interpretation to Deep Neural Networks,” Entropy, vol. 24, pp. 1-28, Jan. 2022.