Opportunities and Challenges of Edge Computing
星际视界IPFSNEWS
Guest columnist
2022-01-05 13:01
This article is about 10,000 words; reading it in full takes about 15 minutes.
This article introduces the concept of edge computing and analyzes its prospects.

Original title: Edge Computing: Vision and Challenges

Original Author: Professor Shi Weisong

The current global wave of digitalization is booming. Edge computing provides key capabilities such as computing, networking, and intelligence close to the data source; it accelerates the transformation and upgrading of the digital economy and has gradually become a new direction for computing systems, a new format in the information field, and a new platform for industrial transformation, now in a stage of rapid development. The concept of edge computing is moving quickly toward practical deployment and has attracted extensive attention from academia and industry.

Edge computing advocates processing data at the edge of the network, thereby reducing system response time, protecting data privacy and security, extending battery life, and saving network bandwidth.


01 What is edge computing

As the amount of data generated at the edge of the network continues to increase, it will be more efficient to process data directly at the edge of the network. Concepts such as micro data center, micro cloud, and fog computing have been proposed before. In this section, we will explain what edge computing is and why edge computing is more effective than cloud computing for certain computing services.

Putting all computing tasks on the cloud has been an effective approach, because the computing power of the cloud far exceeds that of edge devices. But while cloud data processing is fast, network bandwidth is limited, and as data volumes grow, transmission speed has become a bottleneck for cloud computing. For example, a Boeing 787 generates about 5 GB of data every second, but the bandwidth between the aircraft and a satellite or base station cannot accommodate that much transmission. A self-driving car can generate about 1 GB of data per second and needs to process it in real time to take correct actions. If all of this data were sent to the cloud for processing, the response time would become far too long, and supporting many cars working simultaneously in one area would severely strain current network bandwidth and reliability. Data must therefore be processed directly on network edge devices.

Almost all electronic devices will be part of the Internet of Things, and they will play the role of data producers and consumers, such as air quality sensors, street lights, microwave ovens and so on. There are so many of these devices that they generate huge amounts of data, so traditional cloud computing methods will not be able to support such a huge amount of data. Therefore, the large amount of data generated by IoT devices cannot all be transmitted to the cloud, and they need to be processed directly at the edge of the network.

Figure 1 shows the structure of traditional cloud computing. Data producers generate raw data and send it to the cloud, while data consumers send requests to the cloud and use the data. But this structure cannot meet the needs of the Internet of Things era. First, the amount of data generated by devices is too large, resulting in a great deal of unnecessary bandwidth and resource consumption. Second, the need to protect privacy hinders the application of cloud computing. Third, most IoT end nodes are energy-constrained devices, often battery-powered, and wireless communication modules are usually relatively power-hungry, so performing some computing tasks directly on edge nodes is very effective.

In the cloud computing paradigm, end devices at the edge usually play the role of data consumers, such as a smartphone streaming YouTube videos. But people now also act as data producers, using smartphones to take photos and videos and then share them on YouTube, Facebook, Twitter, and so on. These pictures and videos are large, however, and uploading them directly would consume a lot of bandwidth, so they can be adjusted (resized or compressed) on the device before being uploaded to the cloud. Another example is wearable health devices: the data they collect may be quite private, so processing it directly on the device instead of uploading it to the cloud better protects data privacy.

Edge computing is an enabling technology that can compute uplink data for IoT services and downlink data for cloud services at the edge of the network. The "edge" here refers to any computing and network resources between the data source and the cloud data center. For example, a smartphone is the "edge" between individuals and the cloud, and a gateway in a smart home is the "edge" between home devices and the cloud. The basic principle of edge computing is to perform calculations close to the data source. From this point of view, edge computing is similar to fog computing, but edge computing focuses more on the "thing" side, while fog computing focuses more on infrastructure. We think edge computing will have as big an impact on our society as cloud computing.

Figure 2 shows the bidirectional computing flow in edge computing. In the edge computing paradigm, things are not just data consumers, but also data producers. At the edge of the network, things can not only request services and content from the cloud, but also perform computing tasks. The edge can store, cache and process data, while sending cloud services and requests to users. Therefore, it is necessary to properly design the edge of the network to meet the requirements of security, reliability and privacy protection.


02 Related cases

In the cloud computing paradigm, most computation occurs in the cloud, which can lead to long system delays that degrade the user experience. In edge computing, the edge has its own computing resources and can share part of the cloud's computing load.

In a traditional content delivery network (CDN), only data is cached on edge servers, because for the past few decades content providers have supplied their data directly over the Internet. In the IoT era, however, data is both produced and consumed at the edge, so in edge computing both the data and the operations on that data need to be cached at the edge.

One advantage of edge computing can be seen in online shopping services. Consumers may frequently operate the shopping cart. By default, the operations on the shopping cart will be completed on the cloud, and then the shopping cart interface on the client side will be updated. Depending on your internet speed and server load, this process can take a long time, and even longer for mobile devices. As more and more shopping is done on mobile clients, in order to improve user experience, the shopping cart update operation can be moved to edge nodes. As mentioned earlier, user shopping cart data and operations on shopping carts can be cached at edge nodes. Of course, the user's shopping cart data will eventually be synchronized to the cloud, but these can run in the background.

When a user moves from one edge node to another, this involves multi-node collaboration. We can simply cache the data to each edge node that the user arrives at, but the synchronization of each node needs to be further studied. For example: navigation applications in a small area can move navigation or search services to the edge; content filtering and integration can be performed on edge nodes to reduce the amount of data transmission; real-time applications such as AR can use edge nodes to reduce response time. Therefore, using edge computing can reduce system latency and greatly improve user experience.
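As a minimal sketch of the shopping-cart idea above, the snippet below applies cart operations at an edge node immediately (so the user interface updates without a cloud round trip) and queues each operation for background synchronization to the cloud. The class and method names are illustrative, not a real shopping API.

```python
from collections import defaultdict

class EdgeCartCache:
    """Hypothetical edge-node cache for one user's shopping cart:
    operations take effect locally at once, and a background job
    later replays them against the cloud."""

    def __init__(self):
        self.items = defaultdict(int)   # item -> quantity, the local view
        self.pending = []               # operations awaiting cloud sync

    def add_item(self, item, qty=1):
        self.items[item] += qty
        self.pending.append(("add", item, qty))
        return dict(self.items)         # updated view returned immediately

    def remove_item(self, item, qty=1):
        self.items[item] = max(0, self.items[item] - qty)
        self.pending.append(("remove", item, qty))
        return dict(self.items)

    def sync_to_cloud(self, cloud_apply):
        """Replay queued operations against the cloud in arrival order."""
        while self.pending:
            cloud_apply(self.pending.pop(0))
```

Because `sync_to_cloud` runs in the background, the user-visible latency is only that of the local update, while the cloud eventually sees the same sequence of operations.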

The ubiquity of mobile phones and webcams has made video analytics an emerging technology. Cloud computing is not suitable for video analysis because of the long delay in data transmission and privacy concerns. Here we mention an example of searching for missing children. Now there are a lot of cameras in the city, when a child is lost, he/she is likely to be captured by a camera. However, due to privacy issues and transmission costs, these camera data are usually not all transmitted to the cloud, so it is difficult for us to utilize such a wide range of cameras.

Even with access to this data from the cloud, transferring and searching such a large amount of data can be time-consuming, which can be intolerable for missing children. We can leverage the edge computing paradigm to send missing child search requests from the cloud to devices in the target area. Each device in a specific area, such as a smartphone, will search in the local camera data, and then only return the search results, so that the search time will be greatly reduced.
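The search fan-out just described can be sketched as follows: the cloud pushes a matching predicate to each device in the target area, each device scans only its own local data, and only the matches travel back. The device records and the matcher below are hypothetical stand-ins for real camera footage and a recognition model.

```python
def edge_search(devices, matcher):
    """Distribute a search request to edge devices.

    `devices` maps a device id to its locally stored frames;
    `matcher` is the predicate pushed down from the cloud.
    Raw footage never leaves a device: only matches are returned.
    """
    results = {}
    for device_id, local_frames in devices.items():
        hits = [frame for frame in local_frames if matcher(frame)]  # runs on-device
        if hits:
            results[device_id] = hits   # only search results travel to the cloud
    return results
```

The bandwidth saving is the point: instead of uploading every frame from every camera, the cloud receives only the (usually tiny) set of matches.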

The Internet of Things has greatly improved the home environment. Some related products have appeared on the market, such as smart lights, smart TVs, sweeping robots and so on. But just connecting devices to the cloud through wireless communication modules such as Wi-Fi is far from being a smart home. In a smart home, in addition to connected devices, a large number of sensors and controllers should be deployed in rooms, pipes, floors, walls, etc. They will generate a large amount of data, but considering privacy issues and transmission pressure, most of these data need to be used directly locally. This makes cloud computing no longer suitable for smart homes, and will be replaced by edge computing. By running the edge operating system (EdgeOS) on the home gateway, home devices can connect to the gateway, and then deploy related services for unified management.

Figure 3 shows an EdgeOS structure in a smart home. EdgeOS can collect various data in the house through Wi-Fi, Bluetooth, ZigBee, cellular networks, etc. Different data sources are fused at the data abstraction layer. Above the data abstraction layer is the service management layer, which needs to support differentiation, extensibility, isolation, and reliability.

The edge computing paradigm can be applied in smart homes, communities and even in cities. The main reasons are as follows:

1. Large amount of data: According to relevant data, a city with a permanent population of one million will generate 180PB of data every day, and these data come from public safety, medical care, transportation, etc. It is unrealistic to build a centralized cloud data center to process these data. Edge computing is an effective solution.

2. Low latency: For those applications that require deterministic and low latency, such as medical equipment or public safety equipment, edge computing is also a suitable paradigm, which can save transmission time and simplify the network structure. Compared with cloud processing, data processing at the edge will make decision-making more efficient.

3. Location awareness: For geographic location-based applications such as transportation facility management, edge computing can obtain more accurate location information. Data can be collected and processed based on location without being sent to the cloud.

In industry and academia, the cloud can be considered the standard computing platform for big data processing. Cloud computing requires data to be transferred to the cloud for processing, but in many cases the stakeholders who own the data are unwilling to share it, due to privacy concerns and data transfer costs, so opportunities for multi-stakeholder collaboration are limited. The edge, acting as a small data center, connects the cloud and end users, and a collaborative edge links several different edges together. This mesh of connections allows different stakeholders to collaborate and share data.

A very valuable application in the near future is connected healthcare applications, as shown in Figure 4. For example, if there is an outbreak of influenza, patients flow to the hospital, and the patient's electronic medical records will be updated at the same time. Hospitals count and share information about flu outbreaks, such as average treatment costs, symptoms, number of sick people, and more. Theoretically, the patient will go to the pharmacy to get the medicine according to the prescription, but it is also possible that the patient does not follow the doctor's advice for treatment, but the hospital does not know that the patient did not take the medicine, so the hospital has to bear the responsibility for retreatment. Now through the Collaborative Edge, pharmacies can provide hospitals with records of patient purchases, bringing clarity to medical accountability.

At the same time, the pharmacy uses the collaborative edge to obtain the number of patients from the hospital, so that the pharmacy can stock up in advance and thus obtain more profits. In addition, the pharmacy can also obtain the price, location, and inventory of the drug from the pharmaceutical company. The pharmacy can also obtain the delivery price of the logistics company, so as to formulate a more appropriate medication plan. Pharmaceutical companies can formulate reasonable production plans based on the medication data sent by pharmacies. At the same time, the government Center for Disease Control and Prevention can also issue warnings to people in a specific area by detecting the number of sick people and take corresponding measures to curb the spread of influenza.


03 Opportunities and challenges

In the previous section, we introduced several application examples of edge computing. In this section we summarize the challenges of edge computing and propose some directions worth further research, mainly concerning programmability, naming, data abstraction, service management, privacy and security, and optimization metrics.

1. Programmability

In cloud computing, users write programs and deploy them to the cloud; the cloud provider decides where the computation runs. Users do not know how the application runs internally, and this transparency of the cloud infrastructure is one of cloud computing's advantages. Because the program runs only in the cloud, it is usually written in one programming language and compiled for one target platform. In edge computing, however, computing tasks are distributed across edge nodes on various platforms whose runtimes differ, which creates great difficulty for program developers.

In order to solve the programmability problem of edge computing, we propose the concept of computing flow. It refers to a series of operations on data on the data transmission path. These actions can include all or part of the app's functionality. Computing flow is a software-defined computing process that can process data in a distributed and efficient manner on data generation devices, edge nodes, and cloud environments.

By the definition of edge computing, computation should be done at the edge rather than in the cloud. In this case, the computing flow can help users decide which operations should be performed where and how data should be propagated. The metrics for placing an operation can include latency, energy consumption, and hardware and software constraints. By deploying computing flows, we believe data computation should happen as close to the data source as possible, reducing data transmission costs. Within a computing flow, operations can be redistributed, in which case the corresponding data and state must be redistributed as well. Collaboration problems such as data synchronization also have to be solved.
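A toy version of this placement decision can be written down directly: for each candidate layer (device, edge, cloud), estimate the time to move the input data there plus the time to compute on it, and pick the cheapest. The bandwidth and speed figures below are illustrative assumptions, not measurements.

```python
def place_operation(data_mb, work_units, layers):
    """Choose the layer minimizing estimated transfer time + compute time.

    data_mb    -- size of the operation's input data (megabytes)
    work_units -- abstract amount of computation required
    layers     -- list of dicts: name, uplink_mbps (bandwidth from the
                  data source to that layer), speed (work units/second)
    """
    def total_latency(layer):
        transfer = data_mb * 8 / layer["uplink_mbps"]  # seconds to ship the data
        compute = work_units / layer["speed"]          # seconds to process it
        return transfer + compute

    return min(layers, key=total_latency)["name"]

# Hypothetical three-layer hierarchy: the device needs no transfer,
# the edge is nearby but modest, the cloud is distant but powerful.
LAYERS = [
    {"name": "device", "uplink_mbps": float("inf"), "speed": 1},
    {"name": "edge", "uplink_mbps": 100, "speed": 10},
    {"name": "cloud", "uplink_mbps": 10, "speed": 100},
]
```

Under these made-up numbers, small data with moderate work lands on the edge, huge data stays on the device, and compute-heavy work on small data goes to the cloud, which matches the intuition in the text.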

2. Naming

In edge computing, an important assumption is that the number of things is very large. Many applications run on edge nodes, each with its own service organization. As in any computer system, naming in edge computing is important for programming, addressing, object identification, and data communication. However, no efficient, standardized naming mechanism has yet been established for the edge computing paradigm, so edge developers must learn a variety of network communication protocols in order to communicate with heterogeneous systems. A naming scheme for edge computing needs to address the mobility of things, highly dynamic network topologies, privacy and security protection, and scalability to a very large number of unreliable things.

Traditional naming mechanisms such as DNS and URIs satisfy most current network requirements, but they are not flexible enough to serve dynamic edge networks, where devices are highly mobile and resource-constrained. For devices with very limited resources, IP-based naming can be too heavyweight to support.

New naming mechanisms such as Named Data Network (NDN) and MobilityFirst can be applied in edge computing. NDN provides a hierarchical naming structure, which has good scalability, good readability, and facilitates service management. However, in order to be suitable for other communication protocols such as Bluetooth, Zigbee, etc., additional agents need to be added to it. Another issue with NDN is security, as it is difficult to separate device hardware information from the service provider. In order to provide better mobile support, MobilityFirst can separate the name from the network address, but requires the use of a globally unique identifier (GUID). Another disadvantage of MobilityFirst is that it is not convenient for service management, because the GUID is not very readable.

For a relatively small fixed edge, such as a home environment, EdgeOS can assign a network address to each device. In a system, each device has a unique human-readable name, which describes the following information: location, role, data description. For example "Kitchen.Microwave.Temperature".

As shown in Figure 5, EdgeOS assigns each object a corresponding identifier and address. Each object has a unique human-readable name, which facilitates service management, object identification, and parts replacement. This naming mechanism is very convenient for both users and service providers. For example, the user might receive a message from EdgeOS such as "the light on the bedroom ceiling is broken", so the user can replace the bulb directly without looking up an error code or reconfiguring the bulb's network address. This naming mechanism gives service providers better programmability, shields hardware information, and better protects data privacy and security. Unique identifiers and network addresses map one-to-one with human-readable names. EdgeOS uses identifiers for object management, while network addresses such as IP or MAC addresses support the various communication protocols, such as Bluetooth, Wi-Fi, and ZigBee.
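The location.role.data naming convention and its one-to-one mapping to identifiers and addresses can be sketched in a few lines. The registry below is a simplified illustration of the idea, not an actual EdgeOS interface, and the identifier/address values in the usage are made up.

```python
def parse_edge_name(name):
    """Split a human-readable name like "Kitchen.Microwave.Temperature"
    into the three fields the text describes: location, role, data."""
    location, role, data = name.split(".")
    return {"location": location, "role": role, "data": data}

class NameRegistry:
    """Minimal one-to-one mapping between a readable name, an internal
    identifier (used by EdgeOS for management), and a network address
    (used by the underlying communication protocol)."""

    def __init__(self):
        self.by_name = {}

    def register(self, name, identifier, address):
        self.by_name[name] = {"id": identifier, "addr": address,
                              **parse_edge_name(name)}

    def lookup(self, name):
        return self.by_name[name]
```

Because the readable name carries location and role, a message like "the light on the bedroom ceiling is broken" can be generated straight from the name, with hardware details kept behind the identifier.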

3. Data abstraction

Various applications run on EdgeOS, and each application provides specific services through the service management layer API. The problem of data abstraction has been intensively studied in wireless sensor networks and cloud computing paradigms. But in edge computing, this problem becomes more challenging. In the era of the Internet of Things, there are a large number of data-generating devices in the network. Here we take the smart home as an example. In the smart home environment, almost all devices will send data to EdgeOS. But most devices on the edge of the network only periodically send data to the gateway. For example, a thermometer sends data every minute, but this data is only used by real users a few times in a day. Another example is home security cameras. It records data and sends it to the gateway at any time, but this data stays in the database for a while, is not used by anyone, and is eventually replaced by new data.

Based on the above, we believe that human intervention should be reduced as much as possible in edge computing, and edge nodes should consume/process all data and interact with users in a proactive manner. In this case, the gateway needs to preprocess the data, such as noise removal, event detection, and privacy protection. The processed data will be sent to the upper layer to provide appropriate services. This process faces several challenges.

First, as shown in Figure 6, the data formats transmitted from different devices are different. Considering privacy and security issues, the application on the gateway should not get the raw data, it only needs to get the content of interest from the full data table. The format of the data table can be ID, time, name, data (such as 0000,12:34:56pm 01/01/2022, kitchen.oven2.temperature3, 78). But the sensor data is hidden, so data availability may be affected. Second, it is sometimes difficult to decide the level of data abstraction. Some apps or services may not get enough information if too much raw data is filtered out. But data storage can also be cumbersome if you keep too much raw data. Sometimes due to low sensor accuracy, unstable environment or abnormal communication, the data information on the edge device may be unreliable, so how to abstract useful information from unreliable data is also a challenge.
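The gateway-side filtering described above, where an application sees only the fields it is interested in rather than the raw row, can be illustrated with the ID/time/name/data row format from the text. The row values are the example values given above; the function itself is a sketch, not a real gateway API.

```python
def abstract_record(raw_row, fields_of_interest):
    """Gateway-side data abstraction: map a raw sensor row into the
    ID/time/name/data table format, then expose only the requested
    fields to the application."""
    row = {"id": raw_row[0], "time": raw_row[1],
           "name": raw_row[2], "data": raw_row[3]}
    return {key: row[key] for key in fields_of_interest}
```

An application that only needs a temperature reading thus never sees the device identifier, which is one small way the abstraction layer limits what applications learn about the raw data.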

Collecting data is for applications. In order to complete specific services, applications need to control objects, such as reading and writing data. The data abstraction layer combines data presentation methods and corresponding operations, and provides a common interface. In addition, due to the diversity of devices, data presentation methods and corresponding operations are different, so it is not easy to find a general data abstraction method.

4. Service Management

For service management at the edge of the network, we believe that, to ensure system stability, it needs the following characteristics: differentiability, scalability, isolation, and reliability.

Differentiability: With the rapid development of the IoT, multiple services will be deployed at the edge of the network. These services should have different priorities: critical services such as fault diagnosis and failure alarms should be executed before ordinary services, and for health-related services, heartbeat detection should have the highest priority.
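One common way to realize this kind of priority differentiation is a priority queue, as in the sketch below, where a lower number means higher priority so that a failure alarm is dispatched before ordinary services. The service names and priority values are illustrative.

```python
import heapq

class ServiceQueue:
    """Priority-differentiated service dispatch for an edge node.
    Lower priority number = more urgent; a sequence counter breaks
    ties so equal-priority services run in FIFO order."""

    def __init__(self):
        self._heap = []
        self._seq = 0

    def submit(self, priority, service):
        heapq.heappush(self._heap, (priority, self._seq, service))
        self._seq += 1

    def next_service(self):
        """Pop and return the most urgent pending service."""
        return heapq.heappop(self._heap)[2]
```

A real service management layer would add preemption and deadlines on top of this, but the ordering guarantee is the core of differentiability.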

Scalability: Scalability is a big challenge for the network edge. Devices in IoT are more dynamic than mobile systems. Whether the new equipment purchased by the user can be connected to the original system will be a problem to be solved first. These problems can be solved by designing a flexible and scalable service management layer.

Isolation: Isolation is another issue that needs to be addressed at the edge of the network. On mobile systems, if the application crashes, the entire system will restart. In distributed systems, shared resources can be managed through different synchronization mechanisms such as locks or token rings. But in EdgeOS, the problem is more complicated.

Multiple applications will share the same resources, such as the control of lights. If the app crashes or becomes unresponsive, the user should still be able to control the lights without breaking the entire EdgeOS. After the user removes the app that controls the light from the system, the light still needs to remain connected to EdgeOS. We can solve this problem by deploying/undeploying the framework. If a conflict is detected before the app is installed, a warning is sent to the user to avoid potential access issues. Another issue is how to isolate user personal data from third-party applications. For example, your activity tracking app cannot access your battery usage data. In order to solve this problem, we can add an access control mechanism in the service management layer of EdgeOS.
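The access-control idea at the end of the paragraph above can be sketched as an explicit grant table in the service management layer: each application is granted a set of data categories, and anything not granted is denied by default. The app names and categories below are hypothetical.

```python
class AccessControl:
    """Default-deny access control for EdgeOS's service management
    layer: an app may read a data category only if it was granted."""

    def __init__(self):
        self.grants = {}   # app name -> set of permitted data categories

    def grant(self, app, categories):
        self.grants.setdefault(app, set()).update(categories)

    def can_read(self, app, category):
        return category in self.grants.get(app, set())
```

With this in place, the activity-tracking example from the text falls out directly: the tracker is granted step and heart-rate data, so a request for battery-usage data is refused.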

Reliability: Reliability is also an important issue. From a service perspective, it is sometimes difficult to identify exactly why a service failed. For example, if an air conditioner fails, possible causes include a power outage, a compressor failure, or even a dead thermostat battery. Sensor nodes may lose connection to the system due to depleted batteries, poor connection conditions, or worn components. It would be nice if EdgeOS could alert the user which part is unresponsive, or warn the user in advance which part of the system is at risk of damage. From a system point of view, it is very important to maintain the type of network topology of the entire system, and each component in the system can send status/diagnostic information to EdgeOS. This makes it easy to deploy services such as error detection, device replacement, and data quality inspection.

From the data point of view, the challenges to reliability come mainly from the sensing and communication parts. As previously discussed, edge devices can fail for various reasons and may send unreliable data. Many novel communication protocols for IoT data collection can support large numbers of sensor nodes and dynamic network conditions, but their connection reliability is not as good as Bluetooth or Wi-Fi. Providing reliable services is challenging when both the data and the communications are unreliable.

5. Privacy and Security

At the edge of the network, data privacy and security protection is an important service. If IoT applications are deployed in homes, a large amount of users' private data will be collected. For example, one can infer whether anyone is at home by reading electricity and water usage data. How to provide services without compromising privacy is thus a real problem. Some private information can be removed before the data is processed, such as masking faces in videos. We think that computing at the edge data source, that is, in the home, may be a good way to protect privacy and data security.

We want to raise awareness about data privacy and security. Taking Wi-Fi networks as an example: among 439 million home network connections, 49% of Wi-Fi networks are insecure and 80% of home routers still use default passwords, while 89% of public Wi-Fi hotspots are insecure. All stakeholders, including service providers, system and application developers, and end users, need to be aware that user privacy may be violated: cameras, health monitors, and even Wi-Fi toys can be accessed by others if left unprotected.

The second issue to mention is data ownership. In mobile applications, end-user data is stored and analyzed by the service provider. But letting the data stay where it was generated, and letting the user own the data can better protect privacy. Similar to health data, user data collected at the edge should be kept at the edge and it is up to the user to decide whether to provide it to the service provider.

The third problem is that there are too few effective tools to protect privacy and data security at the edge of the network. Some devices have limited resources, and some current security protection methods cannot be deployed on them. Moreover, the environment at the edge of the network is changeable, so it is vulnerable to attacks and difficult to protect. In order to protect privacy, some platforms such as mHealth have proposed unified health data storage standards. But for edge computing, there is a lack of tools for processing all kinds of data.

6. Optimization metrics

In edge computing, there are multiple layers with computing power. So how should the workload be distributed? We can consider the following distribution strategies, such as distributing the load evenly in each layer or completing as many tasks as possible in each layer. The extreme cases are operating entirely on the endpoint or entirely in the cloud. In order to choose the best allocation strategy, in this section we discuss several optimization metrics, including latency, bandwidth, energy consumption, and cost.

Latency: Latency is one of the most important performance metrics, especially for interactive applications and services. Servers in cloud computing provide powerful computing capability and can handle complex tasks such as image processing and speech recognition in a short time. But latency is not determined by computation time alone: long network latency can profoundly affect the behavior of real-time and interactive applications. To reduce latency, work is best done at the nearest capable physical layer.

For example, in the smart-city case, we can use the mobile phone to process local photos first and then send only the information about the missing child to the cloud, instead of uploading all the photos, which is much faster. But working at the nearest physical layer is not always best. We need to consider resource usage and avoid unnecessary waiting so that we can build an optimal logical hierarchy. For example, while a user is playing a game, the phone's computing resources are already occupied, so it is better to transfer the photos to the nearest gateway or micro data center for processing.

Bandwidth: From a latency perspective, high bandwidth reduces transfer times, and for short-distance transmission we can build high-bandwidth links to send data to the edge. On the one hand, if data can be processed at the edge, system latency drops greatly and bandwidth between the edge and the cloud is saved. For example, in the smart home case, almost all data can be processed at the gateway over Wi-Fi or other high-speed links; transmission reliability also improves because the transmission distance is short. On the other hand, even when the edge cannot do all the work, it can at least significantly reduce the amount of uploaded data by preprocessing it.

Power Consumption: Batteries are the most precious resource for devices at the edge of the network. For the endpoint layer, doing some of the work at the edge saves energy. But the key is to make a trade-off between computing energy consumption and transmission energy consumption. In general, we first consider the energy consumption characteristics of the workload. Is the amount of calculation large? How many resources will be used? In addition to network signal strength, data size and available bandwidth affect transmission energy consumption. If the transmission overhead is less than the local computing overhead, it is better to use edge computing.
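The trade-off in the paragraph above reduces, in its simplest form, to comparing the energy needed to transmit the input against the energy needed to compute locally. The helper below sketches that comparison; the joule figures in the usage are illustrative assumptions, not measured device characteristics.

```python
def should_offload(local_compute_joules, data_mb, tx_joules_per_mb):
    """Return True when offloading saves energy for the endpoint:
    i.e., the energy to transmit the input data is less than the
    energy to compute the result locally.

    local_compute_joules -- estimated energy to run the task on-device
    data_mb              -- input data that would have to be uploaded
    tx_joules_per_mb     -- transmission cost per MB (depends on signal
                            strength and available bandwidth)
    """
    transmission_joules = data_mb * tx_joules_per_mb
    return transmission_joules < local_compute_joules
```

Note this is the endpoint's view only; as the next paragraph explains, accounting for the whole hierarchy (including multi-hop transmission and busy intermediate layers) can change the optimal decision.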

But if we focus on the entire edge computing process, not just the endpoints, then the total energy consumption should be the sum of the energy consumption of each layer. Similar to the endpoint layer, the energy consumption of each layer includes local computing energy consumption and transmission energy consumption. Thus the optimal work allocation strategy may change. For example, if the local data center is busy, the work should be uploaded to the upper layer for completion. Compared with computing at the endpoint, multi-hop transmission will significantly increase the system overhead and thus increase the energy consumption.

Cost: From the perspective of service providers, such as YouTube, Amazon, etc., edge computing provides them with less latency and energy consumption, thereby increasing data throughput and improving user experience. Therefore, they can make more profit while processing the same workload. For example, based on the interests of the majority of residents, we can place a certain popular video at the edge of the building layer, so that the edge of the city layer can handle more complex tasks and the overall data throughput can be improved. The service provider's input is the cost of creating and maintaining each layer. In order to make full use of local data at each layer, providers can charge users according to the location of the data, and new cost models need to be developed to ensure service provider profits and user acceptability.


04 Summary

When people first talked about doing business on the Internet, many thought it was a joke, and edge computing met with similar doubts at the beginning. The factory floor can be seen as a critical frontier of IT development, and the transformation that swept telecommunications 25 years ago will happen in industry as well: general-purpose, virtualized computing will change everything, because the economic value is so attractive as to be irresistible.

Now, because processing data at the edge guarantees shorter response times and better reliability, more and more services are moving from the cloud to the network edge. Processing large amounts of data at the edge avoids sending it to the cloud, saving bandwidth. The popularity of the Internet of Things and mobile devices has changed the edge's role in the computing paradigm: the edge is shifting from a pure data consumer to both a data producer and consumer, and it is more efficient to process data at the edge of the network.

Re-summarize the value of edge computing: As an extension of cloud computing, edge computing extends the service capabilities of cloud computing to the edge closer to users, thereby helping applications provide a lower-latency business experience. We hope that edge computing will make data processing more efficient and life better in the future.
