IoTPy: Python + Streams + Agents for Streaming Applications

Ingest and analyze streams of data generated by sensors, social media and other sources.

Kanianthra Chandy

Distributed Systems, Internet of Things (IoT), Messaging and Job Queues (RabbitMQ/Redis/...), Scientific Libraries (Numpy/Pandas/SciKit/...), Sensors


Sensors, social media, news feeds, webcams and other sources generate streams of data, which are analyzed to control actuators, generate alerts, and feed displays. These applications process streams on onboard computers, such as the Raspberry Pi, that are connected directly to sensors, and they send summarized information to the cloud for further processing. These applications have two characteristics: (1) Concurrency: The applications are concurrent, using multiple threads to connect to sensors and actuators, shared memory across multiple processes on multicore machines, and message passing for distributed systems spanning multiple computers. (2) Data Analysis: The applications use programs from a variety of libraries, including those for signal processing, machine learning and natural language processing.

Developers of streaming applications can use open-source software to deal with both characteristics. Concurrency: multiprocessing.Array can be used to construct shared-memory multiprocessing Python programs on multicore computers, and messaging technologies such as AMQP and Kafka can be used to build distributed applications. Data Analysis: A vast collection of open-source Python libraries can be used to analyze data in streams. However, developers of streaming applications encounter an impedance mismatch between the software libraries that address these two characteristics. The next paragraph describes the mismatch and how IoTPy addresses it.
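As a rough illustration of the shared-memory approach mentioned above, the following minimal sketch uses only the standard library's multiprocessing.Array and multiprocessing.Value. The function names (sensor_process, summarize) and the hard-coded readings are hypothetical stand-ins for a real sensor; this is not IoTPy code.

```python
import multiprocessing as mp


def sensor_process(buffer, count, readings):
    """Child process: copy readings into the shared buffer (hypothetical sensor)."""
    with buffer.get_lock():
        for i, value in enumerate(readings):
            buffer[i] = value
        count.value = len(readings)


def summarize(buffer, count):
    """Return (n, mean) of the samples currently held in the shared buffer."""
    with buffer.get_lock():
        n = count.value
        samples = buffer[:n]
    return n, (sum(samples) / n if n else 0.0)


def main():
    buffer = mp.Array("d", 8)  # eight doubles in shared memory, initialized to 0.0
    count = mp.Value("i", 0)   # number of valid samples in the buffer
    worker = mp.Process(target=sensor_process,
                        args=(buffer, count, [1.0, 2.0, 3.0]))
    worker.start()
    worker.join()
    print(summarize(buffer, count))


if __name__ == "__main__":
    main()
```

The parent and child see the same memory, so no data is copied between processes; the lock returned by get_lock() keeps a reader from observing a half-written buffer.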

Programs in most software libraries apply a function to data, get results, and terminate execution. By contrast, streaming applications are perpetual processes that analyze endless streams of data. IoTPy helps developers: (1) build non-terminating streaming applications by harnessing conventional terminating programs from Python’s huge base of libraries and (2) create multithreaded, multicore and distributed Python applications by simply connecting streams to each other.
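The idea of harnessing a terminating function inside a non-terminating computation can be sketched in plain Python with generators. This is only an illustration of the concept under stated assumptions and does not use or reproduce IoTPy's API; map_window and sensor_readings are hypothetical names, and statistics.mean stands in for any conventional library function.

```python
from collections import deque
from itertools import count, islice
from statistics import mean


def map_window(func, stream, window_size):
    """Apply a terminating function to each sliding window of a stream.

    Yields one result per new item once the window is full, so the output
    is itself a stream that never ends if the input never ends.
    """
    window = deque(maxlen=window_size)
    for item in stream:
        window.append(item)
        if len(window) == window_size:
            yield func(window)


def sensor_readings():
    """Hypothetical endless source (0, 1, 2, ...) standing in for a sensor."""
    return count()


# mean() terminates on every call, yet the stream of averages is endless.
averages = map_window(mean, sensor_readings(), window_size=3)

# Take only the first four results so that this example finishes.
first_four = list(islice(averages, 4))
```

Because map_window returns a stream, its output can be fed into another map_window call, which loosely mirrors the notion of building an application by connecting streams to each other.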

Type: Talk (30 mins); Python level: Beginner; Domain level: Beginner

Kanianthra Chandy

California Institute of Technology

K. Mani Chandy received his B.Tech degree in 1965 from IIT Madras, his Master's in Electrical Engineering in 1966 from New York University, and his PhD in Operations Research in 1969 from the Massachusetts Institute of Technology. He worked at Honeywell and IBM and taught in the Computer Science Department at the University of Texas at Austin from 1970 to 1987, where he was the Regents Chair Professor and also served as the chair of the CS department. He was at the Computer Science Department at the California Institute of Technology from 1987 to 2014, where he was the Simon Ramo Chair, and is now an Emeritus Professor. He received the A. A. Michelson Award in 1985 and the IEEE Koji Kobayashi Award in 1995 for his contributions to computer performance modeling. He was awarded the John Sherman Fairchild Scholarship in 1987. He received the ACM SIGOPS Hall of Fame Award and the ACM Edsger W. Dijkstra Prize, with Leslie Lamport, for their paper on distributed snapshots. He received the IEEE Harry H. Goode Award in 2017 with Jayadev Misra for contributions to distributed computing. He became an IEEE Fellow in 1990, was inducted into the United States National Academy of Engineering in 1995 and became an ACM Fellow and received a Distinguished Alumnus Award from IIT Madras in 2019. He received teaching awards at the University of Texas and at Caltech and has graduated over 30 PhD students. Chandy has written four books and published widely cited papers on queuing theory and the performance analysis of computing and communication systems; formal reasoning about concurrent computing systems; programming languages for parallel computing; complex event processing for detecting threats such as earthquakes; and mathematical models of electrical power systems. He is developing a framework, IoTPy, to help students and novice programmers build applications for processing streams of data from sensors and other sources.