Named Data Networking (NDN) has been proposed as an entirely new network architecture for the future Internet, in which packets carry data names rather than IP addresses. In NDN, all communication is driven by the receiving end, i.e., the consumers, through the exchange of two distinct types of packets: Interest and Data. Both types of packets carry a name that identifies a piece of data that can be transmitted in one Data packet. To fetch desired content, a consumer sends an Interest packet with a unique identifying name into the network. Routers use this name to forward the Interest towards the producer(s). On the forwarding path, once the Interest reaches a node that has the requested Data, i.e., the Interest name is the same as the Data name or a prefix of it, the Data packet containing both the name and the content follows the reverse of the path taken by the Interest back to the consumer.
For name-based packet forwarding, the core component is the stateful forwarding plane, in which three tables are deployed: the Content Store, the Pending Interest Table (PIT) and the Forwarding Information Base (FIB). The Content Store is a temporary cache of Data packets the router has received, used to answer repeated Interest packets. The PIT stores all the Interests that a router has forwarded but not yet satisfied; each entry records the name carried in an Interest packet together with its incoming and outgoing interfaces. The FIB stores the name prefixes announced in routing, together with the outgoing interfaces that serve as next hops, to ensure proper packet forwarding.
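The interplay of the three tables can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not the design of any real NDN forwarder: the class and method names are hypothetical, the Content Store matches exact names only, each PIT entry records incoming faces only, and the FIB is searched by component-wise longest prefix match.

```python
class Forwarder:
    """Toy NDN node holding a Content Store, a PIT and a FIB (illustrative only)."""

    def __init__(self):
        self.content_store = {}  # full name -> cached Data payload
        self.pit = {}            # full name -> set of incoming faces still waiting
        self.fib = {}            # name prefix (tuple of components) -> next-hop face

    @staticmethod
    def components(name):
        # "/NDN/TJU/maps" -> ("NDN", "TJU", "maps")
        return tuple(c for c in name.split("/") if c)

    def longest_prefix_match(self, name):
        comps = self.components(name)
        # Try the longest prefix first, then progressively shorter ones.
        for k in range(len(comps), 0, -1):
            face = self.fib.get(comps[:k])
            if face is not None:
                return face
        return None

    def on_interest(self, name, in_face):
        if name in self.content_store:      # cache hit: answer immediately
            return ("data", in_face)
        if name in self.pit:                # already pending: remember the new face
            self.pit[name].add(in_face)
            return ("aggregated", None)
        out_face = self.longest_prefix_match(name)
        if out_face is None:                # no route for this name
            return ("drop", None)
        self.pit[name] = {in_face}          # record state, then forward upstream
        return ("forward", out_face)

    def on_data(self, name, payload):
        faces = self.pit.pop(name, None)
        if faces is None:                   # no pending Interest: ignore the Data
            return []
        self.content_store[name] = payload  # cache for later Interests
        return sorted(faces)                # faces the Data is sent back on
```

Every Interest touches up to three name-keyed structures, which is why the lookup structure behind them dominates both speed and memory.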
|Dataset||# of Names||layer-1||layer-2||layer-3||layer-4||layer-5||layer-6|
Given that the forwarding plane in NDN is so different from that of IP, it requires substantial re-engineering of lookup structures for fast, memory-efficient, and scalable packet forwarding. So far, many novel indexes based on the trie, hash table, Bloom filter and skip list have been proposed for the NDN forwarding plane to support efficient name lookup. As large-scale real NDN names are not available yet, most of these indexes are optimized for or evaluated with URLs collected from current networks. Unfortunately, the performance of an index relies heavily on the distribution of the data indexed. The distribution of URLs indexed in memory may differ from that of names independently generated by content-centric applications online, which will impact the performance of existing indexes designed using URLs. Taking a hash table with CityHash256 as an example, the distributions of two datasets of URL-like names extracted from Blacklist in 2013 and 2020, respectively, are listed in Table I. The distributions of the two datasets in memory differ in the number of conflicts per layer, which seriously impacts the lookup speed and memory consumption of the hash table. As shown in Fig. 1, as the number of data items indexed increases, the gap between the lookup times on the two datasets from different years also increases. More generally, to look up real names with different distributions in the future while preserving their performance, the existing indexes proposed for the NDN forwarding plane would have to be redesigned to adapt to the distributions of real NDN names, which would cost a great deal of engineering effort.
To close this gap, a smart mapping model based on neural networks, called Pyramid-NN, is proposed to construct a mapping function that adapts to the distributions of the names indexed by learning those distributions in the static memory. Moreover, based on Pyramid-NN, an index called LNI is proposed, which not only achieves efficient performance but also preserves it without redesign, simply by retraining with future real names to adapt to their distributions. The main contributions are as follows:
In order to build an efficient index that can deal with changes in data distributions without redesign, a smart pyramid-like neural network model named Pyramid-NN is first conceived. Pyramid-NN learns the distributions of the data indexed in the static memory by training, so that it can not only adapt to those distributions but also map the data indexed more uniformly, which improves the memory utilization. Pyramid-NN is designed as a multi-level model consisting of a number of Back Propagation Neural Networks (BPNNs), and a level-by-level training algorithm is put forward to support efficient model training. Moreover, proper hyperparameters are selected by simulations and analysis.
Based on Pyramid-NN, an index called Learning Name Index (LNI) is proposed for the NDN forwarding plane to support efficient name lookup. In LNI, the Input Processor turns a variable-length NDN name into a fixed-dimensional input vector; Pyramid-NN does the mapping, which improves the memory utilization and adapts to complex NDN names, as it can collect training data and select label rules flexibly; the Enhanced Bitmap dynamically gets the memory address for storing data, which further reduces the memory consumption. Moreover, a lookup algorithm for LNI is proposed to implement fast name lookup.
The performance of Pyramid-NN is presented in detail to verify its feasibility as a mapping function. Because Pyramid-NN learns the distributions of the data indexed in the static memory, it requires only about 25% of the slots that traditional hash functions need to keep the false positive probability under 1%, and its execution speed is on the same order of magnitude as that of CityHash256.
The performance of the LNI-based FIB, called LNI-FIB, is evaluated and discussed through comparative experiments with three state-of-the-art indexes, namely Hash Table-FIB, Binary Patricia Trie-FIB and B-MaFIB. The results show that LNI-FIB greatly reduces the memory consumption, to 10.258, 22.258, 36.258 and 58.258 MB for 0.5, 1, 1.5 and 2 million names with the false positive probability under 1%, which means it can easily fit into contemporary SRAMs in commercial line cards. Also, the throughput of LNI-FIB is about 177 million searches per second (MSPS), which comfortably meets the current network requirement for fast packet processing.
The remainder of this paper is organized as follows. Section II surveys related work. Section III provides the design essentials of NDN name lookup. Section IV presents Pyramid-NN, including the design overview, the model architecture, the training process and the model hyperparameter selection. Section V describes LNI, presenting its architecture and lookup process. Section VI shows the performance of Pyramid-NN in detail. Section VII compares and discusses the performance of LNI-FIB through comparative experiments. Section VIII gives a brief conclusion and outlines future work.
II Related Work
In this section, the indexes proposed for the NDN forwarding plane are summarized.
II-A Trie-based Schemes
The logical structure of a trie can reduce the memory consumption of the hierarchical names stored in an NDN router, so [13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31] propose trie-based schemes for the NDN forwarding plane. Among them, the main research issues are how to design the granularity of the trie to reduce the memory consumption and how to reduce its depth to improve the lookup speed. For example, NameTrie proposes minASCII encoding to store and index forwarding information more efficiently; CONSERT removes redundancy to minimize the number of name prefixes; PC-NPT proposes path compression to reduce the average number of node accesses; Binary Patricia Trie uses bits as the granularity to minimize the impact of redundant information on memory; CTrie builds a combinational trie structure from both component-based and byte-based hierarchical names to achieve a unified index.
II-B Hash Table-based Schemes
Hash tables have advantages in lookup speed, so [18, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41] propose hash table-based schemes for the NDN forwarding plane. Among them, the main research issues are how to ensure more accurate forwarding, how to reduce the memory consumption and how to support name matching algorithms. For example, FHT uses a fingerprint collision table to reduce the memory consumption; Binary Search of Hash Tables constructs a balanced binary search tree to improve the execution efficiency of the name matching algorithm; CoDE achieves fast name lookup and update using conflict-driven encoding; MOBS concentrates on optimizing the random search algorithm to reduce the memory consumption. However, these schemes have to additionally store all the content names to ensure accurate forwarding, which makes them memory-inefficient.
II-C Bloom Filter-based Schemes
Bloom filters can greatly reduce the memory consumption, so [42, 43, 44, 45, 46, 12, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57] propose Bloom filter-based schemes for the NDN forwarding plane. Among them, the main research issue is how to overcome the limitation that a Bloom filter can only determine whether an element is in the set but cannot locate its memory address. For example, NameFilter and DiPIT assign a Bloom filter to each interface of the forwarding plane; BFAST combines a Bloom filter with a hash table; MBF [45, 46] and B-MBF combine a Bloom filter with a Mapping Array and a Bitmap to locate the memory address.
II-D Skip List-based Schemes
Skip lists preserve the order of data storage and effectively support the cache replacement policy, so skip list-based schemes have also been proposed for the NDN forwarding plane. For example, Locality-Aware Skip List records the address of the skip list node accessed when querying a node, which improves the lookup speed to some extent. However, the time complexity of such schemes is generally high due to the limitation of their basic structure.
II-E Machine Learning-based Schemes
In view of the rapid development of machine learning techniques in recent years, machine learning-based schemes have been proposed for indexes in the NDN forwarding plane. Learning Tree was proposed to learn the distribution of data to build an efficient index. Learned Bloom-Filter Lookup combines a Recursive Neural Network (RNN) with a standard Bloom filter to improve lookup efficiency.
The indexes based on traditional data structures mentioned above have shown encouraging performance. They are optimized for or evaluated with URLs collected from the Internet, as large-scale NDN names are not available yet. However, the distributions of real NDN names may differ from those of the URLs currently used. As these indexes cannot adapt to the distributions, they have to be redesigned to preserve their performance, and the engineering effort is usually too high. Fortunately, machine learning opens up an opportunity to tackle this issue. By using a neural network to learn the distributions of names and construct a mapping function, the index only needs to be retrained with real names rather than redesigned, which not only saves engineering cost but also achieves efficient performance.
III Design Essentials of NDN Name Lookup
In this section, three design essentials of NDN name lookup are described in detail, summarizing the existing theoretical results on NDN name lookup.
III-A Complex Name Structure
Unlike fixed-length IP addresses, NDN names are variable-length with no upper bound and have complex, unrestrained formats. What is worse, the design of data naming remains an open challenge so far, due to the different requirements that applications, security, and the network place on data names.
As lookup keys in NDN forwarding plane, the complex NDN names have to be scanned in the forwarding processes. Consequently, NDN forwarding plane has to support effective lookup for arbitrary complex names.
III-B Small Memory Footprint
Compared with IP, the NDN forwarding plane calls for much more memory space for two reasons. First, the number of entries in the NDN forwarding plane is orders of magnitude greater than that in IP. Taking the PIT as an example, since an Interest stays in the PIT of each NDN router along the path until the corresponding Data returns, the PIT needs to hold at least 1 million entries for a 10 Gbps gateway trace and 1.5 million entries for 20 Gbps. Second, the size of each entry in the NDN forwarding plane is also larger, since an NDN name is more complex than an IP address. Together, these factors result in forwarding tables with a larger memory footprint than IP forwarding tables. Therefore, reducing the size of the three tables in the NDN forwarding plane so that they can be deployed in small, high-speed memories (e.g., SRAM) is a great challenge.
III-C High Throughput
The NDN forwarding plane requires frequent name lookup and update operations. Whenever Interest or Data packets arrive at NDN routers or the routing protocols recompute the FIB, the corresponding names have to be scanned and the corresponding operations performed in the NDN forwarding plane. As reported in prior work, considering a load equal to 100%, the PIT's operations peak at 60 million per second; in a more realistic scenario with flow balance or a load of 50%, the frequency of the PIT's operations is about 6 million per second. Therefore, the NDN forwarding plane has to perform name lookup at a practicable high speed, so that it can satisfy the requirement for high throughput in NDN routers.
IV Smart Mapping Model: Pyramid-NN
Learning techniques are introduced to design a smart mapping model via neural networks, called Pyramid-NN. In this section, the design overview, the model architecture, the training process and the model hyperparameter selection are described in detail.
IV-A Design Overview
In order to build an index that can deal with changes in data distributions without redesign, Pyramid-NN uses neural networks to learn the distributions of the data indexed in the static memory. The distribution of the data indexed is reflected in its cumulative distribution function (CDF), whose value at a lookup key is the probability that a key is less than or equal to that lookup key. A property of the CDF is that, for a data set with an arbitrary distribution, the values calculated by its CDF are uniformly distributed on [0, 1]. If the CDF is used as a mapping function, each key is therefore equally likely to be mapped to any slot. The mapped slots can thus be distributed uniformly by multiplying the CDF values by the total number of slots in memory.
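The effect of using the CDF as a mapping function can be seen in a small Python sketch. It is an illustration only: instead of a trained neural network it uses the exact empirical CDF of the stored keys (the rank of a key divided by the number of keys), and the function name `make_cdf_mapper` and the cubic test keys are hypothetical choices.

```python
import bisect
from collections import Counter


def make_cdf_mapper(keys, num_slots):
    """Return a function mapping a key to a slot via the empirical CDF of `keys`."""
    sorted_keys = sorted(keys)
    n = len(sorted_keys)

    def slot_of(key):
        # rank = number of stored keys strictly smaller than `key`,
        # so rank / n plays the role of the CDF value in [0, 1).
        rank = bisect.bisect_left(sorted_keys, key)
        # floor(rank / n * num_slots), computed in integer arithmetic.
        return min(rank * num_slots // n, num_slots - 1)

    return slot_of


# Heavily skewed keys: most values are tiny compared with the largest one.
keys = [i ** 3 for i in range(1000)]
slot_of = make_cdf_mapper(keys, num_slots=100)

# Slot occupancy under the CDF mapping.
cdf_load = Counter(slot_of(k) for k in keys)

# For contrast: scale each key linearly by the largest key.
linear_load = Counter(k * 100 // (keys[-1] + 1) for k in keys)
```

Under the CDF mapping every slot receives the same number of keys regardless of how skewed the keys are, whereas the linear scaling piles a large share of the keys into the first slot. Pyramid-NN approximates this CDF with neural networks rather than storing the sorted keys.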
Specifically, the design overview of Pyramid-NN is illustrated in Fig. 2. The first phase constructs the training set. A large number of variable-length content names are collected and turned into fixed-dimensional vectors. As lookup keys in the NDN routing table, these vectors are sorted by value and then labeled with their ordinals. The second phase trains neural networks on the vector-ordinal pairs to learn the CDF of the data indexed. The final phase is the application, in which the trained neural networks do the mapping. In practical use, Pyramid-NN is trained with actual content names and deployed in a content router. The names of NDN packets are input into the trained neural networks, which estimate the CDF values. Finally, the mapped slots are obtained by multiplying the CDF values by the total number of slots in memory, so that they are distributed more uniformly.
IV-B Model Architecture and Training Process
Lookup speed is the most critical requirement for the index. To ensure lookup speed, BPNNs, which have a simple structure, fast execution speed, and strong nonlinear mapping ability, are used to build the model. However, as shown in Fig. 4(a), a single BPNN (i.e., a model with 1 level) has difficulty learning the CDF of millions of data items accurately. With 2 or 3 levels, Pyramid-NN achieves good accuracy, since it efficiently divides the large namespace into multiple smaller sub-namespaces so that each BPNN at the last level only has to represent the CDF of a relatively small amount of data. Thus, Pyramid-NN is designed to be a multi-level neural network model. Meanwhile, the multi-level model can process data in parallel to further improve the lookup speed.
Fig. 3 gives an example of a two-level Pyramid-NN, which consists of 1 BPNN at level 1 and 1,000 BPNNs at level 2. Let $\mathrm{BPNN}_{j,k}$ denote the $k$-th BPNN at level $j$. $\mathrm{BPNN}_{1,0}$ is trained to output a region number from 0 to 999, each of which corresponds to one $\mathrm{BPNN}_{2,k}$ ($0 \le k \le 999$). Each $\mathrm{BPNN}_{2,k}$ is trained to learn one part of the CDF. Therefore, the estimation ranges of all the trained $\mathrm{BPNN}_{2,k}$ together cover the entire CDF, i.e., the trained Pyramid-NN can be seen as a function that estimates the CDF value.
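The two-level dispatch described above can be sketched in Python as follows. This is a structural illustration under assumptions, not the trained model: the level-1 and level-2 networks are replaced by hypothetical stand-in functions (`level1_stub`, `level2_stub`) that are exact for keys uniformly distributed on [0, 1), and the clamping of out-of-range outputs is an added safeguard not described in the text.

```python
NUM_REGIONS = 1000  # number of level-2 models, as in the two-level example


def pyramid_map(vec, level1, level2, num_slots):
    """Map an input vector to a slot: level 1 picks a region, level 2 estimates the CDF."""
    # Level 1: region number, clamped to a valid level-2 index.
    region = int(level1(vec))
    region = max(0, min(len(level2) - 1, region))
    # Level 2: the model responsible for that region estimates the CDF value.
    cdf = level2[region](vec)
    cdf = max(0.0, min(1.0, cdf))
    # Scale the CDF value by the number of slots to get the mapped slot.
    return min(int(cdf * num_slots), num_slots - 1)


# Stand-ins for trained networks (assumption: one-dimensional keys uniform on [0, 1),
# whose true CDF is simply the key itself).
def level1_stub(vec):
    return vec[0] * NUM_REGIONS


def level2_stub(vec):
    return vec[0]


level2_models = [level2_stub] * NUM_REGIONS
slot = pyramid_map((0.8,), level1_stub, level2_models, num_slots=15)
```

Only one level-1 model and one level-2 model are evaluated per lookup, so the cost of a mapping does not grow with the number of level-2 models.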
|Dataset||# of Names||Average Length (B)||Size (MB)|
|Test Set 1||500,000||25.895||13.133|
|Test Set 2||1,000,000||25.864||26.235|
|Test Set 3||1,500,000||25.872||39.364|
|Test Set 4||2,000,000||25.887||52.514|
Algorithm 1 shows the training process of Pyramid-NN. First, input vectors and labels are obtained from all the training content names. For each $i \in [0, \mathit{Size} - 1]$, where $\mathit{Size}$ is the number of training names, the $i$-th name is processed into the corresponding input vector. For speed, the input process uses a bitxor method, which is faster than hash functions. Suppose the input vector is $y$ with dimension $N$, and an $n$-length NDN name is viewed as a vector $x = (x_0, x_1, \ldots, x_{n-1})$, where $x_i$ is the ASCII value of the $i$-th character. For a name with length $n \le N$, set $y_i = x_i$ for $0 \le i < n$ and $y_i = 0$ for $n \le i < N$. For $n > N$, split every $N$ elements of $x$ into a sub-vector, then set $y_i = \mathrm{Bitxor}(\text{the } i\text{-th elements of all sub-vectors})$ for $0 \le i < N$. Thus the $N$-dimensional input vector $y$ is obtained (line 4); $\lfloor i / \mathit{Size} \times 1000 \rfloor$ is calculated as the $i$-th label of level 1, which is an integer in [0, 999]; $i / \mathit{Size}$ is calculated as the $i$-th label of level 2, which is a decimal in [0, 1]. Then all the input vectors are sorted in ascending order to correspond to the labels (lines 5-8). Second, the level-1 BPNN is trained with all the input vectors and labels of level 1 (lines 9-15). Third, all of the level-2 BPNNs are trained based on the outputs of level 1. For each $k \in [0, 999]$, all the input vectors with output $k$ at level 1 and their corresponding level-2 labels are picked out as the input vectors and labels of the $k$-th level-2 BPNN, which is then trained (lines 16-26). For each BPNN, the time complexity of training is $O(n)$, where $n$ is the number of training epochs.
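The input processing and labeling steps can be sketched in Python as below. The sketch makes several assumptions that are not stated in the text: the function names are hypothetical, vectors are sorted lexicographically, labels are assigned to the sorted vectors by their ordinal, and the number of regions is a parameter so that small examples are possible.

```python
def name_to_vector(name, dim=5):
    """Fold a variable-length name into a dim-dimensional vector by bitwise XOR."""
    vec = [0] * dim
    for i, ch in enumerate(name):
        # Character i belongs to position i % dim of its sub-vector, so XOR-ing
        # here equals XOR-ing the i-th elements of all sub-vectors. Names shorter
        # than dim are copied and implicitly zero-padded.
        vec[i % dim] ^= ord(ch)
    return tuple(vec)


def build_training_set(names, dim=5, regions=1000):
    """Return (vector, level-1 label, level-2 label) triples for the given names."""
    vectors = sorted(name_to_vector(n, dim) for n in names)  # ascending order
    size = len(vectors)
    samples = []
    for i, vec in enumerate(vectors):
        level1_label = i * regions // size  # floor(i / Size * regions), an integer
        level2_label = i / size             # ordinal scaled into [0, 1)
        samples.append((vec, level1_label, level2_label))
    return samples


example_vector = name_to_vector("/NDN/TJU/maps")
samples = build_training_set(["/d", "/b", "/a", "/c"], regions=2)
```

For the name /NDN/TJU/maps and a dimension of 5, this folding yields the vector (26, 116, 98, 97, 66).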
IV-C Model Hyperparameter Selection
Hyperparameter selection is important when designing a neural network. In Pyramid-NN, the number of model levels and the number of neurons in one BPNN directly affect not only the classification accuracy (i.e., the proportion of input names whose outputs match their labels), but also the memory consumption and the execution time. Furthermore, the number of input neurons in one BPNN also affects the probability of input collision (i.e., the proportion of distinct names that are processed into the same input vector). Therefore, to select proper hyperparameters, Pyramid-NN is trained with MATLAB under different hyperparameter settings. The performance is tested on a workstation with a 3.50 GHz Intel Xeon E5-1650 v2 CPU and 24 GB of DDR3 SDRAM. Given that more than 1 million entries have to be stored in the NDN forwarding plane, billions of different NDN names are generated according to the naming conventions as the experimental dataset, of which 100 million are used as the training set, 2 million as the validation set, and 0.5, 1, 1.5 and 2 million as the test sets, as listed in Table II.
Note that the hyperparameters selected below represent only one possible configuration for implementing Pyramid-NN; Pyramid-NN is not required to use them.
IV-C1 Number of Model Levels
As shown in Fig. 4(a), the classification accuracy of the multi-level models is relatively high. Fig. 4(b) shows the memory consumption of Pyramid-NN and the time consumption per million executions; the 2-level Pyramid-NN requires less memory and executes faster than the 3-level Pyramid-NN, so Pyramid-NN is determined to be a two-level model.
IV-C2 Number of BPNNs at Level 2
As the number of entries in the NDN forwarding plane is on the order of millions, 1,000 BPNNs are implemented at level 2, so that each BPNN maps thousands of data items and the mapping is more uniform.
IV-C3 Number of Input Neurons for One BPNN
As illustrated in Fig. 5(a), as the number of input neurons increases, both the input collision probability and the classification accuracy decrease. According to the input processing algorithm introduced in Subsection IV-B, increasing the number of input neurons reduces the probability that distinct names are processed into the same input vector (i.e., the input collision probability), but the model becomes overfitted, which decreases the classification accuracy.
Specifically, when the number of input neurons is 3 to 6, the classification accuracy is greater than 80%. With 5 or 6 input neurons, however, the input collision probability is much lower than in the other two cases and meets the current network requirement of a 1% packet loss rate. Therefore, the number of input neurons can be 5 or 6. Moreover, considering the memory consumption and execution time shown in Fig. 5(b), Pyramid-NN with 5 input neurons requires less memory and a shorter execution time than Pyramid-NN with 6 input neurons. Thus, the number of input neurons is determined to be 5.
IV-C4 Number of Hidden Neurons for One BPNN
Fig. 6(a) shows that when the number of hidden neurons is greater than or equal to 20, the classification accuracy stabilizes at about 85%. Considering also the memory consumption and execution time shown in Fig. 6(b), Pyramid-NN requires more memory and a longer execution time as the number of hidden neurons increases. Hence the number of hidden neurons is determined to be 20.
V Learning Name Index: LNI
Based on Pyramid-NN, an index called LNI is proposed for the NDN forwarding plane to support efficient NDN name lookup. In this section, the index architecture and lookup process of LNI are presented in detail.
V-A Index Architecture
The index architecture of LNI is shown in Fig. 7, which contains three units: the Input Processor, Pyramid-NN and the Enhanced Bitmap.
The Input Processor turns a variable-length NDN name into a fixed-dimensional input vector to support efficient lookup for complex names, as described in Subsection IV-B.
Pyramid-NN does the mapping. According to the hyperparameter selection in Subsection IV-C, Pyramid-NN is a two-level model consisting of 1 BPNN at level 1 and 1,000 BPNNs at level 2, where each BPNN has 5 input neurons, 20 hidden neurons and 1 output neuron. The trained BPNNs produce the CDF value, and the mapped slot is obtained by multiplying this value by the total number of slots.
The Enhanced Bitmap dynamically gets the memory address for storing data, which further reduces the memory consumption. Specifically, the Enhanced Bitmap is composed of slots of 2 bytes each and is evenly segmented into M equal parts corresponding to M dynamic memory spaces. From the output of Pyramid-NN, the mapped slot in the Enhanced Bitmap is obtained and the part containing the slot is calculated. Then, according to the order in which names are inserted into this part, the corresponding sequence number is assigned and recorded in the slot, serving as the offset address. From the base address of the memory space corresponding to this part and the offset address given by the sequence number in the slot, the actual memory address for storing forwarding information is obtained.
The classification accuracy of the BPNNs in Pyramid-NN and the number of slots in the Enhanced Bitmap jointly affect the false positive probability of LNI. For instance, if a name input into the level-1 BPNN is mapped to the wrong BPNN at level 2, it will be mapped to a slot in the Enhanced Bitmap already occupied by a correctly classified name (i.e., a false positive); that is, the probability of false positives increases as the classification accuracy decreases. As for the number of slots in the Enhanced Bitmap, the larger it is, the more dispersed the data are in the Enhanced Bitmap, and thus the lower the false positive probability. With the classification accuracy of the BPNNs in Pyramid-NN stabilized at about 85% (as shown in Fig. 6(a)), the number of slots in the Enhanced Bitmap is enlarged appropriately to reduce the false positive probability to less than 1%, which is the current network requirement for packet loss rate.
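The address calculation of the Enhanced Bitmap can be sketched in Python as follows. The class below is a simplified model under assumptions, not the paper's implementation: the class and method names are hypothetical, sequence numbers start at 1 so that 0 can mark an empty slot, parts are indexed from 0, and addresses are counted in entries rather than bytes.

```python
class EnhancedBitmap:
    """Slots record per-part sequence numbers that serve as offset addresses."""

    EMPTY = 0  # assumption: sequence numbers start at 1, so 0 marks an empty slot

    def __init__(self, num_slots, base_addresses):
        num_parts = len(base_addresses)
        if num_slots % num_parts != 0:
            raise ValueError("slots must divide evenly into parts")
        self.slots = [self.EMPTY] * num_slots
        self.slots_per_part = num_slots // num_parts
        self.base = list(base_addresses)  # base address of each part's memory space
        self.inserted = [0] * num_parts   # names inserted so far, per part

    def part_of(self, slot):
        return slot // self.slots_per_part

    def insert(self, slot):
        """Assign the next sequence number of the slot's part; return the address."""
        part = self.part_of(slot)
        if self.slots[slot] == self.EMPTY:
            self.inserted[part] += 1
            self.slots[slot] = self.inserted[part]
        return self.address(slot)

    def address(self, slot):
        """Base address of the part plus the offset recorded in the slot."""
        offset = self.slots[slot]
        if offset == self.EMPTY:
            return None
        return self.base[self.part_of(slot)] + offset


bitmap = EnhancedBitmap(num_slots=15, base_addresses=[1000, 2000, 3000])
bitmap.insert(10)
bitmap.insert(14)
address_of_slot_12 = bitmap.insert(12)
```

Because offsets are assigned in insertion order, each memory space is filled densely from its base address even when the occupied slots of a part are scattered, which is where the memory saving comes from.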
V-B Lookup Process
The lookup process of LNI is described in Algorithm 2. When an NDN name x is input, it is first split and combined by the Bitxor operation to get the 5-dimensional input vector (lines 2-3). The vector is then input into the level-1 BPNN for calculation (line 4). Based on L1_Output (i.e., the output of the level-1 BPNN), the corresponding level-2 BPNN is picked and calculated (line 5), and the mapped slot is equal to L2_Output (i.e., the output of that level-2 BPNN) multiplied by the total number of slots in the Enhanced Bitmap (line 6). Finally, the corresponding slot in the Enhanced Bitmap is queried. If it is not empty, the actual memory address is obtained by adding the offset address recorded in the slot to the base address of the memory space corresponding to its part (lines 7-11).
An example of name lookup through LNI is indicated by the arrows in Fig. 7. For an input NDN name /NDN/TJU/maps, the ASCII values of its characters form the vector x. Then x is split every 5 values into sub-vectors (i.e., sub-vector 1 to sub-vector 3), and the corresponding elements of all sub-vectors are combined by bitxor to obtain the 5-dimensional input vector (26, 116, 98, 97, 66). This vector is then input into Pyramid-NN. Suppose the region number calculated by the level-1 BPNN is 998, so the level-2 BPNN for region 998 is picked next. Suppose the CDF value it calculates is 0.8. The predicted slot in the Enhanced Bitmap is then 0.8 multiplied by the total number of slots, namely $0.8 \times 15 = 12$. Finally, slot 12 in the Enhanced Bitmap is queried. Slot 12 is in the 2nd part and the offset address recorded in it is 3, so the actual memory address is the base address of the memory space corresponding to the 2nd part plus the offset address 3.
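The worked example can be replayed with a short Python sketch of the lookup path. Everything beyond the numbers given in the example is an assumption: the function names are hypothetical, the two networks are replaced by stubs that return the supposed outputs (region 998 and CDF value 0.8), the 15 slots are assumed to be split into 3 parts indexed from 0 (one reading under which slot 12 falls in "the 2nd part"), and the base addresses are made up.

```python
def name_to_vector(name, dim=5):
    """Input Processor: fold the name into a dim-dimensional vector by bitwise XOR."""
    vec = [0] * dim
    for i, ch in enumerate(name):
        vec[i % dim] ^= ord(ch)
    return tuple(vec)


def lni_lookup(name, level1, level2, slots, base_addresses):
    """Return the memory address stored for `name`, or None if its slot is empty."""
    vec = name_to_vector(name)
    region = level1(vec)                       # output of the level-1 model
    cdf = level2[region](vec)                  # output of the selected level-2 model
    slot = min(int(cdf * len(slots)), len(slots) - 1)
    offset = slots[slot]                       # sequence number recorded in the slot
    if offset is None:
        return None
    slots_per_part = len(slots) // len(base_addresses)
    part = slot // slots_per_part              # zero-based part index (assumption)
    return base_addresses[part] + offset


# Stubs standing in for the trained networks, returning the values supposed above.
def level1_stub(vec):
    return 998


def level2_stub(vec):
    return 0.8


slots = [None] * 15
slots[12] = 3                                  # offset recorded in slot 12
base_addresses = [0x1000, 0x2000, 0x3000]      # hypothetical base addresses
address = lni_lookup("/NDN/TJU/maps", level1_stub, {998: level2_stub},
                     slots, base_addresses)
```

With these inputs the lookup reaches slot 12 and returns the base address of part 2 plus the offset 3.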
VI Performance of Pyramid-NN
In this section, Pyramid-NN is compared with popular hash functions such as MD5, SHA1 and CityHash256, as the trained Pyramid-NN is analogous to a hash function. The performance analysis covers four aspects, namely the memory utilization, the probability of false positive, the model size and the execution speed, which determine whether Pyramid-NN is acceptable in practice.
The experimental setup and the dataset are the same as described in Subsection IV(C). Pyramid-NN is implemented to be a two-level model consisting of 1 BPNN at level 1 and 1,000 BPNNs at level 2, where each BPNN has 5 input neurons, 20 hidden neurons and 1 output neuron. After training, all the weights and biases are extracted from MATLAB and then Pyramid-NN is regenerated in C++ based on the model specification. The hash functions are also implemented in C++ for fair comparison.
VI-A Memory Utilization
The distribution of mapped slots in memory is first tested when the load factor is 1 (i.e., 0.5 million names are mapped to 0.5 million slots). As shown in Fig. 8(a), the distribution of mapped slots for the hash functions is nonuniform, with a large number of conflicts. Hence many long chains (6 chains in the worst case) are required to deal with the conflicts, which significantly impacts the lookup speed and memory consumption of the hash table. In contrast, as shown in Fig. 8(b), Pyramid-NN maps more uniformly, and the chains required are fewer and shorter, which means better memory utilization and lookup speed.
|# of Names||Pyramid-NN||MD5||SHA1||CityHash256|
Afterwards, the empty slot ratio (i.e., the proportion of empty slots among all slots in memory) of Pyramid-NN and the hash functions is tested when mapping 0.5, 1, 1.5 and 2 million names to 2 million slots. As shown in Fig. 9(a) and Table III, Pyramid-NN has a lower empty slot ratio than the hash functions, and the gap between the two widens as the number of names increases. The reason is that the mapping through Pyramid-NN is relatively uniform, while the mapping through the hash functions is nonuniform and gets worse as the load factor increases.
|# of Names||Pyramid-NN||MD5||SHA1||CityHash256|
|500,000||2.733 10||2.683 10||3.010 10||1.060 10|
|1,000,000||5.176 10||4.935 10||5.522 10||1.952 10|
|1,500,000||7.644 10||7.167 10||8.036 10||2.802 10|
|2,000,000||10.188 10||9.592 10||10.533 10||3.665 10|
Given the current network requirement that the packet loss rate should be under 1%, the number of slots required to keep the false positive probability under 1% is tested with 0.5, 1, 1.5 and 2 million names. As illustrated in Fig. 9(b) and Table IV, the hash functions require about 50× more slots than the number of input names, whereas Pyramid-NN needs only about 10×, which makes much more efficient use of memory. As the number of names increases, the performance of Pyramid-NN declines slightly, because more outputs require better prediction performance from Pyramid-NN, which can be achieved by increasing the number of BPNNs or levels. Even for 2 million names, however, the two-level Pyramid-NN still performs much better than the hash functions, requiring only 25% of the slots they need.
VI-B Probability of False Positive
Fig. 10 shows the probability of false positive under different load factors with 0.5 million names. Compared with the hash functions, the false positive probability of Pyramid-NN is much lower, and the gap between the two widens as the load factor decreases. More detailed results are given in Table V. When the load factor is 1/8, the probability of false positive is already below 1% for Pyramid-NN, but still as high as about 6% for the hash functions. The reason is that Pyramid-NN achieves more uniform mapping than the hash functions (as also illustrated in Fig. 8), which means fewer conflicts and a lower false positive probability.
VI-C Model Size
The model size is the size of all the model parameters that need to be stored. As each BPNN consists of 5 input neurons, 20 hidden neurons and 1 output neuron, the weight and bias matrices for the input-to-hidden connections in one BPNN are $20 \times 5$ and $20 \times 1$ in size respectively, while those for the hidden-to-output connections are $1 \times 20$ and $1 \times 1$. As all the model parameters are 8-byte double-precision floating-point numbers, the size of one BPNN is $(20 \times 5 + 20 \times 1 + 1 \times 20 + 1 \times 1) \times 8\,\text{B} = 1{,}128\,\text{B}$. Further, Pyramid-NN contains 1,001 BPNNs in total, so the total size of Pyramid-NN is $1{,}128\,\text{B} \times 1{,}001 \approx 1.129\,\text{MB}$.
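The size calculation can be checked, and repeated for other layer sizes, with a few lines of Python. The function name is hypothetical, and the megabyte is taken as $10^6$ bytes, which is the convention consistent with the 1.129 MB figure.

```python
def bpnn_param_count(n_in, n_hidden, n_out):
    """Number of weights and biases in a BPNN with one hidden layer."""
    input_to_hidden = n_hidden * n_in + n_hidden   # weight matrix + bias vector
    hidden_to_output = n_out * n_hidden + n_out    # weight matrix + bias vector
    return input_to_hidden + hidden_to_output


BYTES_PER_PARAM = 8    # double-precision floating point
NUM_BPNNS = 1 + 1000   # one level-1 model plus 1,000 level-2 models

bpnn_bytes = bpnn_param_count(5, 20, 1) * BYTES_PER_PARAM
total_bytes = bpnn_bytes * NUM_BPNNS
total_megabytes = total_bytes / 1_000_000
```

The same function shows how the model size would scale under other hyperparameter choices, for example a different number of hidden neurons.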
VI-D Execution Speed
The execution speed is tested against 0.5-, 1-, 1.5- and 2-million names for Pyramid-NN and the hash functions. As listed in Table VI, thanks to the small enough size of Pyramid-NN, the number of clock cycles taken with 0.5-, 1-, 1.5- and 2-million names is only 2.733 10, 5.176 10, 7.644 10 and 10.188 10 respectively, which is an acceptable high speed on the same order of magnitude as that of the hash functions. Therefore, it is feasible to use Pyramid-NN in the index design for NDN forwarding plane, which can satisfy the current network requirements for fast packet processing.
VII Evaluation and Discussion
In this section, the performance of the LNI-based FIB, called LNI-FIB, is evaluated in terms of the memory consumption, the probability of false positive and the throughput. The results are compared with HT-FIB, as the essence of a hash table-based FIB is mapping the data indexed by using a hash function. Although the focus of this paper is on the hash table, for a comprehensive comparison, the Binary Patricia Trie-FIB, which is recognized as efficient, and the Bloom filter-based B-MaFIB are also tested.
Vii-a Experimental Setup
The experimental setup and the dataset are the same as described in Subsection IV(C). All the index schemes are implemented in C++ for a fair comparison.
LNI-FIB is composed of two LNIs. The hyperparameter settings and the generation of Pyramid-NN are the same as described in Section IV, while the size of each slot in the Enhanced Bitmap is set to 2 bytes. For Binary Patricia Trie-FIB, each entry is a 4-byte pointer to the memory that stores the actual packet information. HT-FIB is implemented with MD5, SHA1 and CityHash256 as the hash function, respectively, and each entry is also 4 bytes. For B-MaFIB, the size of the Bloom filter is 2 bits and the size of the MA is 24 bits, while the size of each slot in the Bitmap is 2 bytes.
|level||# of epochs||target error||minimum gradient|
|# of Names||LNI-FIB||Binary Patricia Trie-FIB||HT with MD5-FIB||HT with SHA1-FIB||HT with CityHash256-FIB||B-MaFIB|
|Device||Single-chip Size||#||Total Size||Lookup Time|
|TCAM||2.384 MB||4||9.537 MB||2.7 ns|
|SRAM||32.187 MB||4||128.746 MB||0.47 ns|
|DRAM||3.725 GB||4-8||14.901-29.802 GB||50 ns|
|# of Names||LNI-FIB||HT with MD5-FIB||HT with SHA1-FIB||HT with CityHash256-FIB||B-MaFIB|
VII-B Memory Consumption
If an index can be stored in small, fast memories (e.g., SRAM), routers can forward packets quickly. To determine which indexes can be deployed on SRAMs, the memory consumption of LNI-FIB, Binary Patricia Trie-FIB, HT-FIB and B-MaFIB at a false positive probability of 1% is compared for different numbers of names.
As shown in Fig. 11, LNI-FIB consumes less memory than the others, since its more uniform mapping (as shown in Fig. 8) improves memory utilization. When the number of names is 2 million, the memory consumption of LNI-FIB is 58.258 MB, which is 10% less than that of Binary Patricia Trie-FIB. HT-FIB and B-MaFIB map data to slots in memory randomly and incur a large number of conflicts, so they need more memory to reduce those conflicts. Compared with HT-FIB and B-MaFIB, LNI-FIB, with its higher memory utilization, decreases the memory consumption by 85% and 97%, respectively.
More detailed results are given in Table VIII. The memory consumption of LNI-FIB includes the model parameters of Pyramid-NN and the slots in the Enhanced Bitmap. First, the model parameters of one Pyramid-NN consume 1.129 MB, as indicated in Subsection VI(C), so the two Pyramid-NNs consume $1.129\ \text{MB} \times 2 = 2.258\ \text{MB}$. Further, the number of slots required in the Enhanced Bitmaps is the same as listed in Table IV, and these slots consume 8 MB, 20 MB, 34 MB and 56 MB when the number of names is 0.5, 1, 1.5 and 2 million, respectively. Consequently, the on-chip memory consumption of LNI-FIB is 10.258 MB, 22.258 MB, 36.258 MB and 58.258 MB, respectively. As listed in Table IX, a line card can be configured with four channels of 32.187 MB single-chip SRAMs for a total of 128.746 MB. Table VIII reports the memory consumption in megabytes, so LNI-FIB fits easily into the SRAMs of a contemporary commercial line card. In contrast, the memory consumption of HT-FIB and B-MaFIB limits their deployment on SRAMs.
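The memory budget described above can be checked with a few lines of arithmetic. The sketch below uses only figures quoted in the text (the Pyramid-NN size, the Enhanced Bitmap sizes and the total SRAM capacity from Table IX); the identifiers are hypothetical and chosen for illustration.

```python
PYRAMID_NN_MB = 1.129     # one Pyramid-NN, from Subsection VI(C)
N_LNI = 2                 # LNI-FIB is composed of two LNIs
SRAM_TOTAL_MB = 128.746   # four SRAM channels on a line card, from Table IX

# Enhanced Bitmap sizes in MB, keyed by the number of names in millions.
BITMAP_MB = {0.5: 8, 1.0: 20, 1.5: 34, 2.0: 56}


def lni_fib_memory_mb(bitmap_mb):
    """On-chip memory of LNI-FIB: model parameters plus bitmap slots."""
    return round(N_LNI * PYRAMID_NN_MB + bitmap_mb, 3)


totals = {names: lni_fib_memory_mb(mb) for names, mb in BITMAP_MB.items()}
fits_in_sram = {names: total <= SRAM_TOTAL_MB for names, total in totals.items()}

for names, total in totals.items():
    print(f"{names} million names: {total} MB, fits in SRAM: {fits_in_sram[names]}")
```

Even at 2 million names the total stays below half of the 128.746 MB SRAM capacity, which is the basis for the claim that LNI-FIB fits on a commercial line card.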
VII-C Probability of False Positive
Given the current network requirement that the packet loss rate be under 1%, the false positive probability of LNI-FIB, HT-FIB and B-MaFIB with 32 million slots is compared for different numbers of names.
As shown in Fig. 12, the false positive probability of LNI-FIB is 0.817% when the number of names is 2 million, which is less than a third and a ninth of that of HT-FIB and B-MaFIB, respectively. The reason is that LNI-FIB maps data to slots more uniformly, whereas the mapping of the other two is relatively nonuniform and produces a large number of conflicts.
More detailed results are given in Table X. The false positive probability of LNI-FIB is approximately 0.078%, 0.243%, 0.488% and 0.817% when the number of names is 0.5, 1, 1.5 and 2 million, respectively, which is lower than 1% and meets the current network requirements. In comparison, the false positive probability of HT-FIB and B-MaFIB is much higher than that of LNI-FIB.
Given the current network requirement for fast packet processing, the throughput in executing LNPM is tested against 0.5, 1, 1.5 and 2 million names for LNI-FIB, Binary Patricia Trie-FIB, HT-FIB and B-MaFIB.
The comparison is shown in Fig. 13. LNI-FIB outperforms the others in throughput because it can be deployed on SRAMs. For 2 million names, the throughput of LNI-FIB is 177.37 MSPS, which is about 26% more than that of Binary Patricia Trie-FIB. The reason is that each traversal from one level to the next in Binary Patricia Trie-FIB requires one memory access, and its average height is much greater than that of a traditional trie, which reduces its lookup speed. Compared with HT-FIB and B-MaFIB, the throughput of LNI-FIB is about 100, 110, 38 and 20 more than that of HT with MD5-FIB, HT with SHA1-FIB, HT with CityHash256-FIB and B-MaFIB, respectively. Because HT-FIB and B-MaFIB have large footprints, they have to be deployed on DRAM, which significantly reduces their throughput. Thus, HT-FIB and B-MaFIB cannot adequately meet the current network requirement for packet processing.
More detailed experimental results are given in Table XI. For 0.5, 1, 1.5 and 2 million names, LNI-FIB provides a throughput of about 162.16 MSPS, 173.37 MSPS, 176.90 MSPS and 177.37 MSPS, respectively, because the multi-level Pyramid-NN in LNI-FIB consists of small, simple BPNNs that can run in parallel and can therefore be executed quickly in the NDN forwarding plane.
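The SRAM-versus-DRAM argument can be sanity-checked with a back-of-the-envelope calculation based only on the lookup times in Table IX. This is a rough sketch, not the measurement method used in the experiments: it assumes strictly serial memory accesses and ignores computation time, pipelining and parallel channels, and the function name is hypothetical.

```python
SRAM_ACCESS_NS = 0.47   # lookup time of SRAM in Table IX
DRAM_ACCESS_NS = 50.0   # lookup time of DRAM in Table IX


def accesses_per_second_millions(access_ns):
    """Upper bound on serial memory accesses per second, in millions."""
    return 1e3 / access_ns   # (1e9 ns per second / access_ns) / 1e6


latency_ratio = DRAM_ACCESS_NS / SRAM_ACCESS_NS
sram_bound = accesses_per_second_millions(SRAM_ACCESS_NS)
dram_bound = accesses_per_second_millions(DRAM_ACCESS_NS)

print(f"DRAM/SRAM access-time ratio: {latency_ratio:.1f}")
print(f"SRAM bound: {sram_bound:.0f} million accesses/s")
print(f"DRAM bound: {dram_bound:.0f} million accesses/s")
```

Under these simplifying assumptions, an index that must reside in DRAM is limited to roughly 20 million serial accesses per second, whereas an SRAM-resident index has a bound about two orders of magnitude higher, which is consistent with the direction of the reported results.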
As LNI-FIB, which is based on Pyramid-NN, can learn the distributions of the names indexed in the static memory, it reduces the memory consumption and the false positive probability to 58.258 MB and 0.817%, respectively, for 2 million names. Because it can be deployed on SRAMs, its throughput is about 177 MSPS, which is much better than that of HT-FIB and B-MaFIB. More importantly, LNI-FIB can adapt to the distributions of real NDN names by retraining on those names, without redesign, which not only saves engineering cost but also preserves its efficiency. Therefore, the performance of LNI-FIB is expected to be more stable and better than that of the schemes evaluated with URLs, such as the recognized high-performance Binary Patricia Trie-FIB.
VIII Conclusion and Future Work
This paper proposed a smart mapping model based on neural networks, called Pyramid-NN, to build an index named LNI, which can deal with changes in data distributions without re-engineering. The performance of LNI-FIB was evaluated, and the experimental results show that its memory consumption, false positive probability and throughput are significantly improved by utilizing neural networks and meet the current network requirement for fast packet processing. In the future, by retraining to learn the distributions of the real names indexed in the static memory, LNI can not only save engineering cost but also maintain its efficient performance.
A promising future direction is to extend this design to build an engine that runs on multiple parallel threads with real packets. Another direction is to explore more efficient neural networks for mapping names.
-  L. Zhang et al., “Named data networking (NDN) project,” Xerox Palo Alto Research Center-PARC, Tech. Rep. NDN-0001, 2010, [Online]. Available: http://named-data.net/.
-  ——, “Named data networking,” ACM SIGCOMM Computer Communication Review, vol. 44, no. 3, pp. 66–73, 2014.
-  Y. Cheng, A. Afanasyev, I. Moiseenko, B. Zhang, L. Wang, and L. Zhang, “Smart forwarding: A case for stateful data plane,” Tech. Rep. NDN-0002, 2012, [Online]. Available: http://named-data.net/.
-  Z. Li, Y. Xu, B. Zhang, L. Yan, and K. Liu, “Packet forwarding in named data networking requirements and survey of solutions,” IEEE Communications Surveys and Tutorials, vol. 21, no. 2, pp. 1950–1987, 2018.
-  G. Carofiglio, M. Gallo, L. Muscariello, and D. Perino, “Pending interest table sizing in named data networking,” in Proceedings of the 2nd International Conference on Information-Centric Networking, San Francisco, CA, USA, 2015, pp. 49–58.
-  E. Fredkin, “Trie memory,” Communications of the ACM, vol. 3, no. 9, pp. 490–499, 1960.
-  T. Zink, “A survey of hash tables with summaries for ip lookup applications,” 2009, [Online]. Available: http://nbn-resolving.de/urn:nbn:de:bsz:352-175851.
-  B. H. Bloom, “Space/time trade-offs in hash coding with allowable errors,” Communications of the ACM, vol. 13, no. 7, pp. 422–426, 1970.
-  W. Pugh, “Skip lists: A probabilistic alternative to balanced trees,” Communications of the ACM, vol. 33, no. 6, pp. 668–676, 1990.
-  “Blacklist.” [Online]. Available: http://www.shallalist.de.
-  D. Rumelhart, G. Hinton, and R. Williams, “Learning representations by back propagating errors,” Nature, vol. 323, no. 6088, pp. 533–536, 1986.
-  Z. Li, Y. Xu, K. Liu, X. Wang, and D. Liu, “5g with b-mafib based named data networking,” IEEE Access, vol. 6, pp. 30 501–30 507, 2018.
-  C. Ghasemi, H. Yousefi, K. G. Shin, and B. Zhang, “A fast and memory-efficient trie structure for name-based packet forwarding,” in 2018 IEEE 26th International Conference on Network Protocols (ICNP), Cambridge, UK, 2018, pp. 302–312.
-  H. Dai and B. Liu, “Consert: Constructing optimal name-based routing tables,” Computer Networks, vol. 94, pp. 62–79, 2016.
-  J. Lee and H. Lim, “A new name prefix trie with path compression,” in IEEE International Conference on Consumer Electronics-Asia, Seoul, Korea, 2016, pp. 1–4.
-  T. Song, H. Yuan, P. Crowley, and B. Zhang, “Scalable name-based packet forwarding: From millions to billions,” in Proceedings of the 2nd International Conference on Information-Centric Networking, San Francisco, CA, USA, 2015, pp. 19–28.
-  M. Liu, T. Song, Y. Yang, and B. Zhang, “A unified data structure of name lookup for ndn data plane,” in Proceedings of the 4th ACM Conference on Information-Centric Networking, Berlin, Germany, 2017, pp. 188–189.
-  H. Yuan, “Data structures and algorithms for scalable ndn forwarding,” Dissertations and Theses - Gradworks, 2015, [Online]. Available:
-  J. Seo and H. Lim, “Bitmap-based priority-npt for packet forwarding at named data network,” Computer Communications, vol. 130, pp. 101–112, 2018.
-  A. Afanasyev, J. Shi, B. Zhang, L. Zhang, I. Moiseenko, Y. Yu, W. Shang, Y. Huang, J. P. Abraham, S. DiBenedetto et al., “Nfd developer’s guide,” no. NDN-0021, 2016, [Online]. Available: http://named-data.net/.
-  D. Saxena, V. Raychoudhury, C. Becker, and N. Suri, “Reliable memory efficient name forwarding in named data networking,” in 2016 IEEE Intl Conference on Computational Science and Engineering, Paris, France, 2016, pp. 48–55.
-  D. Saxena and V. Raychoudhury, “Radient: Scalable, memory efficient name lookup algorithm for named data networking,” Journal of Network and Computer Applications, vol. 63, pp. 1–13, 2016.
-  D. Li, J. Li, and Z. Du, “An improved trie-based name lookup scheme for named data networking,” in Computers and Communication (ISCC), Messina, Italy, 2016, pp. 1294–1296.
-  D. Saxena and V. Raychoudhury, “N-fib: Scalable, memory efficient name-based forwarding,” Journal of Network and Computer Applications, vol. 76, pp. 101–109, 2016.
-  S. H. Bouk, S. H. Ahmed, and D. Kim, “Hierarchical and hash based naming with compact trie name management scheme for vehicular content centric networks,” Computer Communications, vol. 71, pp. 73–83, 2015.
-  S. Feng, M. Zhang, R. Zheng, and Q. Wu, “A fast name lookup method in ndn based on hash coding,” in International Conference on Mechatronics and Industrial Informatics, Cambridge, United Kingdom, 2015, pp. 575–580.
-  W. Quan, C. Xu, A. V. Vasilakos, J. Guan, H. Zhang, and L. A. Grieco, “Tb2f: Tree-bitmap and bloom-filter for a scalable and efficient name lookup in content-centric networking,” in 2014 IFIP Networking conference, Trondheim, Norway, 2014, pp. 1–9.
-  Y. Wang, Y. Zu, T. Zhang, K. Peng, Q. Dong, B. Liu, W. Meng, H. Dai, X. Tian, Z. Xu, H. Wu, and D. Yang, “Wire speed name lookup: A gpu-based approach,” in Usenix Conference on Networked Systems Design and Implementation, Lombard, IL, USA, 2013, pp. 199–212.
-  Y. Wang, K. He, H. Dai, W. Meng, J. Jiang, B. Liu, and Y. Chen, “Scalable name lookup in ndn using effective name component encoding,” in 2012 IEEE 32nd International Conference on Distributed Computing Systems (ICDCS), Macau, China, 2012, pp. 688–697.
-  H. Dai, B. Liu, Y. Chen, and Y. Wang, “On pending interest table in named data networking,” in Proceedings of the eighth ACM/IEEE symposium on Architectures for networking and communications systems, Austin, TX, USA, 2012, pp. 211–222.
-  Y. Wang, H. Dai, J. Jiang, K. He, W. Meng, and B. Liu, “Parallel name lookup for named data networking,” in Global Telecommunications Conference (GLOBECOM 2011), Houston, USA, 2011, pp. 1–5.
-  H. Yuan and P. Crowley, “Reliably scalable name prefix lookup,” in Symposium on Architectures for Networking and Communications Systems, Oakland, CA, USA, 2015, pp. 111–121.
-  T. Shen, X. Zhang, G. Xie, Y. Meng, and D. Zhang, “Code: Fast name lookup and update using conflict-driven encoding,” in 2018 IEEE 37th International Performance Computing and Communications Conference (IPCCC), Orlando, Florida, USA, 2018, pp. 1–8.
-  J. Hu, T. Huang, and H. Li, “Fast and scalable name prefix lookup with hash table,” in Proceedings of the ACM SIGCOMM 2019 Conference Posters and Demos, Beijing, China, 2019, pp. 131–133.
-  M. Varvello, D. Perino, and L. Linguaglossa, “On the design and implementation of a wire-speed pending interest table,” in Computer Communications Workshops, Turin, Italy, 2013, pp. 369–374.
-  H. Yuan, T. Song, and P. Crowley, “Scalable ndn forwarding: Concepts, issues and principles,” in International Conference on Computer Communications and Networks, Munich, Germany, 2012, pp. 1–9.
-  Y. Thomas, G. Xylomenos, C. Tsilopoulos, and G. C. Polyzos, “Object-oriented packet caching for icn,” in Proceedings of the 2nd International Conference on Information-Centric Networking, San Francisco, CA, USA, 2015, pp. 89–98.
-  H. Yuan and P. Crowley, “Scalable pending interest table design: From principles to practice,” in IEEE INFOCOM, Toronto, Canada, 2014, pp. 2049–2057.
-  W. So, A. Narayanan, and D. Oran, “Named data networking on a router: Fast and dos-resistant forwarding with hash tables,” in Proceedings of the ninth ACM/IEEE symposium on Architectures for networking and communications systems, San Jose, CA, USA, 2013, pp. 215–226.
-  Y. Wang, D. Tai, T. Zhang, J. Lu, B. Xu, H. Dai, and B. Liu, “Greedy name lookup for named data networking,” ACM SIGMETRICS Performance Evaluation Review, vol. 41, no. 1, pp. 359–360, 2013.
-  W. So, A. Narayanan, D. Oran, and Y. Wang, “Toward fast ndn software forwarding lookup engine based on hash tables,” in Proceedings of the eighth ACM/IEEE symposium on Architectures for networking and communications systems, Austin, TX, USA, 2012, pp. 85–86.
-  Y. Wang, T. Pan, Z. Mi, H. Dai, X. Guo, T. Zhang, B. Liu, and Q. Dong, “Namefilter: Achieving fast name lookup with low memory cost via applying two-stage bloom filters,” in IEEE INFOCOM, Turin, Italy, 2013, pp. 95–99.
-  W. You, B. Mathieu, P. Truong, J.-F. Peltier, and G. Simon, “Dipit: A distributed bloom-filter based pit table for ccn nodes,” in 2012 21st International Conference on Computer Communications and Networks, Munich, Germany, 2012, pp. 1–7.
-  H. Dai, J. Lu, Y. Wang, T. Pan, and B. Liu, “Bfast: High-speed and memory-efficient approach for ndn forwarding engine,” IEEE/ACM Transactions on Networking, 2016.
-  Z. Li, K. Liu, D. Liu, H. Shi, and Y. Chen, “Hybrid wireless networks with fib-based named data networking,” EURASIP Journal on Wireless Communications and Networking, vol. 2017, no. 1, p. 54, 2017.
-  Z. Li, K. Liu, Y. Zhao, and Y. Ma, “Mapit: an enhanced pending interest table for ndn with mapping bloom filter,” IEEE Communications Letters, vol. 18, no. 11, pp. 1915–1918, 2014.
-  R. Hou, L. Zhang, T. Wu, T. Mao, and J. Luo, “Bloom-filter-based request node collaboration caching for named data networking,” Cluster Computing, no. 4, pp. 1–12, 2018.
-  C. Munoz, L. Wang, E. Solana, and J. Crowcroft, “I(fib)f: Iterated bloom filters for routing in named data networks,” in International Conference on Networked Systems, Marrakech, Morocco, 2017, pp. 1–8.
-  R. Zhang, J. Liu, T. Huang, T. Pan, and L. Wu, “Adaptive compression trie based bloom filter: Request filter for ndn content store,” IEEE Access, vol. PP, no. 99, p. 1, 2017.
-  J. Lee, M. Shim, and H. Lim, “Name prefix matching using bloom filter pre-searching for content centric network,” Journal of Network and Computer Applications, vol. 65, pp. 36–47, 2016.
-  K. Shimazaki, T. Aoki, T. Hatano, T. Otsuka, A. Miyazaki, T. Tsuda, and N. Togawa, “Hash-table and balanced-tree based fib architecture for ccn routers,” in 2016 International SoC Design Conference (ISOCC), Jeju, South Korea, 2016, pp. 67–68.
-  V. Manghnani, “Length indexed bloom filter based forwarding in content centeric networking,” 2016, [Online]. Available:
-  W. Yu and D. Pao, “Hardware accelerator to speed up packet processing in ndn router,” Computer Communications, vol. 91, pp. 109–119, 2016.
-  D. Perino, M. Varvello, L. Linguaglossa, R. Laufer, and R. Boislaigue, “Caesar: a content router for high-speed forwarding on content names,” in Proceedings of the tenth ACM/IEEE symposium on Architectures for networking and communications systems, Los Angeles, CA, USA, 2014, pp. 137–148.
-  W. Quan, C. Xu, J. Guan, H. Zhang, and L. A. Grieco, “Scalable name lookup with adaptive prefix bloom filter for named data networking,” IEEE Communications Letters, vol. 18, no. 1, pp. 102–105, 2014.
-  M. Fukushima, A. Tagami, and T. Hasegawa, “Efficient lookup scheme for non-aggregatable name prefixes and its evaluation,” IEICE Transactions on Communications, vol. 96, no. 12, pp. 2953–2963, 2013.
-  Z. Li, J. Bi, S. Wang, and X. Jiang, “Compression of pending interest table with bloom filter in content centric network.” in International Conference on Future Internet Technologies, Seoul, South Korea, 2012, p. 46.
-  T. Pan, T. Huang, J. Liu, J. Zhang, F. Yang, S. Li, and Y. Liu, “Fast content store lookup using locality-aware skip list in content-centric networks,” in Computer Communications Workshops, San Francisco, CA, USA, 2016, pp. 187–192.
-  L. Yan, Z. Li, and K. Liu, “Learning tree: Neural network-based index for ndn forwarding plane,” in Proceedings of the ACM SIGCOMM 2019 Conference Posters and Demos, Beijing, China, 2019, pp. 63–65.
-  Q. Wang, Q. Wu, M. Zhang, R. Zheng, and J. Zhu, “Learned bloom-filter for an efficient name lookup in information-centric networking,” in 2019 IEEE Wireless Communications and Networking Conference (WCNC). IEEE, 2019, pp. 1–6.
-  V. Jacobson et al., “Named data networking next phase (NDN-NP) project,” Xerox Palo Alto Research Center-PARC, Tech. Rep. May 2016-April 2017 Annual Report, 2017, [Online]. Available: http://named-data.net/.
-  C. Yi, A. Afanasyev, L. Wang, B. Zhang, and L. Zhang, “Adaptive forwarding in named data networking,” ACM SIGCOMM computer communication review, vol. 42, no. 3, pp. 62–67, 2012.
-  Y. Yu, A. Afanasyev, Z. Zhu, and L. Zhang, “Ndn technical memo: Naming conventions,” NDN, NDN Memo, Technical Report NDN-0023, 2014.
-  A. Kirsch, M. Mitzenmacher, and G. Varghese, “Hash-based techniques for high-speed packet processing,” in Algorithms for Next Generation Networks. Springer, 2010, pp. 181–218.
-  “The city family of hash functions,” [Online]. Available: http://code.google.com/p/cityhash.