Guest post by Dominic Pritham, R&D Manager, Cooper Lighting Solutions
Sensors are an integral part of a lighting system. While light level changes are what the user experiences, sensors are responsible for driving those changes based on how they have been commissioned. A very common use case: when someone walks into an empty space, a passive infrared (PIR) occupancy sensor detects body heat and actuates the lights. Another commonly used sensor is the daylight sensor, which measures ambient light and adjusts light levels according to the commissioning configuration. As IoT continues to bring together various pieces of technology, a rich feature set has evolved beyond just lighting. We now live and work in spaces that deliver security, entertainment, therapy, lighting and horticulture, all of which can be connected via an IoT system into a holistic solution that eases deployment, maintenance and upgrades.
This involves many different sensors, various commissioning techniques and, often, multiple vendors. One sensor that has been growing in popularity is the camera. Cameras are powerful, multifaceted devices that bring a lot of capability to a lighting system. They can handle occupancy detection, and with the right choice and configuration they can also handle daylighting.
Cameras can recognize individual people, allowing the same space to be customized differently depending on who is occupying it. They can detect objects in real time and alert the authorities if needed; they can convert text in images into characters a computer can store; they can count objects of interest; because they capture spatiotemporal information, they can infer from context; and finally, troubleshooting can be done in the form of a video.
While cameras can deliver all of this, the research and development costs are high because of the complexity. Furthermore, storage costs and network complexity grow rapidly because cameras are high-bandwidth sensors.
The complexity comes from the way images are represented in computers. What a human sees is shown in the image below:
How this information is represented in a computer is shown below:
As you can see, a grayscale image is a two-dimensional matrix of pixel values, each ranging from 0 to 255. That is a lot of highly nonlinear information, so programming a computer to operate on this data heuristically can be very complicated, and the results vary from image to image.
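To make this concrete, here is a minimal sketch of loading an image and inspecting its numeric form. It assumes Pillow and NumPy are installed, and the file name frame.png is just a placeholder.

```python
# Minimal sketch: load an image and inspect its raw numeric representation.
# Assumes Pillow and NumPy are installed; "frame.png" is a hypothetical file.
import numpy as np
from PIL import Image

img = Image.open("frame.png").convert("L")   # convert to 8-bit grayscale
pixels = np.asarray(img)                     # 2-D matrix of pixel intensities

print(pixels.shape)     # e.g. (480, 640): rows x columns
print(pixels.dtype)     # uint8: each value ranges from 0 to 255
print(pixels[:5, :5])   # top-left corner of the matrix
```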
The major areas of interest for cameras in an IoT system can be categorized as follows:
1. Classification
2. Detection
3. Tracking
Classification: Predicting the correct label for a given input image.
Detection: Determining where in an image an object of interest is located, most commonly with a bounding box.
Tracking: Following the object of interest across multiple frames (a minimal sketch of one simple approach follows below).
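As a rough illustration of tracking (not a production method, and not necessarily how a commercial lighting system does it), the sketch below matches each detected bounding box in a new frame to the nearest tracked centroid from the previous frame; the box coordinates and distance threshold are made-up values.

```python
# Minimal centroid-based tracking sketch (illustrative only).
# Each detection is a bounding box (x_min, y_min, x_max, y_max); boxes in the
# new frame are matched to the nearest tracked centroid from the prior frame.
import math

def centroid(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def update_tracks(tracks, detections, max_dist=50.0):
    """tracks: dict of {track_id: centroid}; detections: list of boxes."""
    new_tracks = {}
    next_id = max(tracks, default=-1) + 1
    for box in detections:
        c = centroid(box)
        # Find the closest existing track within max_dist pixels, if any.
        best_id, best_d = None, max_dist
        for tid, prev_c in tracks.items():
            d = math.dist(c, prev_c)
            if d < best_d:
                best_id, best_d = tid, d
        if best_id is None:            # no match: start a new track
            best_id, next_id = next_id, next_id + 1
        new_tracks[best_id] = c
    return new_tracks

# Example: one object drifting to the right keeps the same track ID.
frame1 = update_tracks({}, [(100, 100, 140, 160)])
frame2 = update_tracks(frame1, [(110, 102, 150, 162)])
print(frame1, frame2)
```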
Below is an example of what it takes to train a machine learning model to accomplish classification:
Here you see a set of images and their respective classes, or labels. This is called a training dataset. It is important to note that the cleaner your training dataset is, the better your model will perform. The training dataset should be representative of what you expect the model to see once deployed; for instance, if your IoT system will have cameras installed in the ceiling, prioritize collecting data with a top view. Each class or label is basically a folder containing its images, and typically the folder name is the class name. It is also customary to have a train and a test folder. The test folder is used to test the model after training, so make sure your test dataset is also representative of what your model will see in deployment. A minimal training sketch based on this folder layout is shown below.
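To illustrate that folder-based workflow, here is a hedged sketch using PyTorch and torchvision; the dataset/train and dataset/test paths, batch size, learning rate and choice of a ResNet-18 backbone are all assumptions for demonstration, not the author's actual setup.

```python
# Sketch of folder-based classification training.
# Assumes PyTorch and torchvision are installed; "dataset/train" and
# "dataset/test" are hypothetical folders whose sub-folder names are the
# class labels, exactly as described above.
import torch
from torch import nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

train_ds = datasets.ImageFolder("dataset/train", transform=tfm)
test_ds = datasets.ImageFolder("dataset/test", transform=tfm)   # held out for evaluation
train_loader = torch.utils.data.DataLoader(train_ds, batch_size=16, shuffle=True)

# Start from a pretrained backbone and replace the final layer with one
# output per class (the class names come straight from the folder names).
model = models.resnet18(weights="IMAGENET1K_V1")
model.fc = nn.Linear(model.fc.in_features, len(train_ds.classes))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):                      # a few epochs for illustration
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```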
Here is another example, this time of the object detection training phase:
There are several annotation tools, and a quick search will return various options. Whichever one you decide to use, the concept is the same: annotate the object of interest. In this example, the object of interest is this Ironman mug. Not just any mug, but specifically an Ironman mug. Draw the bounding box as tight as possible.
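To show what an annotation actually produces, here is a small sketch that parses a YOLO-style bounding-box line; the file contents, image size and the class name ironman_mug are illustrative assumptions, and your annotation tool may export a different format (such as Pascal VOC XML) with the same information.

```python
# Sketch of reading a YOLO-style annotation produced by an annotation tool.
# Each line is: class_id x_center y_center width height (all normalized 0-1).
# The example values and the single class "ironman_mug" are assumptions.
classes = ["ironman_mug"]
annotation_line = "0 0.52 0.61 0.18 0.27"    # example content of one .txt label file

img_w, img_h = 640, 480                      # size of the matching image
cls_id, xc, yc, w, h = annotation_line.split()
xc, yc = float(xc) * img_w, float(yc) * img_h
w, h = float(w) * img_w, float(h) * img_h

x_min, y_min = xc - w / 2, yc - h / 2
x_max, y_max = xc + w / 2, yc + h / 2
print(classes[int(cls_id)],
      (round(x_min), round(y_min), round(x_max), round(y_max)))
```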
Once you have annotated the images, start the training phase. Classically, training is split into two stages, feature extraction and classification, though there are also deep learning approaches that learn both end to end. A rough sketch of the two-stage approach is shown below.
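As one possible instance of that two-stage pipeline (an assumption about tooling, not the author's implementation), the sketch below extracts HOG features with scikit-image and classifies them with a linear SVM from scikit-learn, using random placeholder data in place of real image crops.

```python
# Sketch of the two-stage pipeline: feature extraction, then classification.
# Assumes scikit-image and scikit-learn are installed; the crops and labels
# are random placeholders standing in for annotated object and background crops.
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

def extract_features(gray_image):
    """Stage 1: turn a grayscale crop into a fixed-length HOG feature vector."""
    return hog(gray_image, orientations=9,
               pixels_per_cell=(8, 8), cells_per_block=(2, 2))

# Placeholder data: 64x64 grayscale crops, label 1 = object, 0 = background.
rng = np.random.default_rng(0)
crops = [rng.random((64, 64)) for _ in range(20)]
labels = [1] * 10 + [0] * 10

features = np.array([extract_features(c) for c in crops])

# Stage 2: a linear classifier decides object vs. not-object per crop.
clf = LinearSVC(max_iter=5000)
clf.fit(features, labels)
print(clf.predict(features[:2]))
```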
As you can see from the deployment image, the background and surroundings have changed, but the detection is still accurate.
The research and development time for machine learning can be long, and accuracy may never reach 100%. Constant improvements can be made to the model, either to add new classes or to improve existing accuracy. There is certainly a future for cameras integrated into lighting IoT systems, bringing a rich feature set with advanced connectivity and inference. There could be a seamless flow of information between smart cities, buildings and homes by leveraging cameras to improve security, accessibility and personalization.
Applications of integrating cameras with lighting infrastructure include Amber Alert support (via OCR, optical character recognition), home security, parking assistance, hospice care, personalized lighting, people counting and contact tracing, to name a few.
Computer vision and machine learning work best when there is a very well-defined objective, and the data collected for training should represent what the model is expected to see in deployment. As technology improves, developing these models will become faster, more efficient and more accurate.