Advanced Capsule Networks via Context Awareness
Capsule Networks (CN) offer new architectures for the Deep Learning (DL) community. Although they have demonstrated effectiveness on the MNIST and smallNORB datasets, these networks still face many challenges on other datasets, where images contain varying levels of background. In this research, we improve the design of CN (vector version) and run experiments comparing the accuracy and speed of CN against DL models. In CN, we add more pooling layers to filter the input images and extend the reconstruction layers to restore images more faithfully. Among DL models, we utilize Inception V3 and DenseNet-201 for high-end computers, and NASNet, MobileNet V1, and MobileNet V2 for small and embedded devices. We evaluate our models on a fingerspelling alphabet dataset from American Sign Language (ASL). The results show that CNs perform comparably to DL models while dramatically reducing training time. We also provide a demonstration for the purpose of illustration.
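The abstract names two modifications to the vector CapsNet but gives no layer details, so the following is only a rough PyTorch sketch of the idea: extra pooling layers in front of the primary capsules to filter background from the input, and a reconstruction decoder extended by one hidden layer. All layer sizes, the 64x64 input, and the 24-class output (the static ASL alphabet is often treated as 24 classes, since J and Z involve motion) are assumptions, not the paper's reported configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def squash(s, dim=-1, eps=1e-8):
    """Capsule squashing non-linearity: keeps direction, maps length into (0, 1)."""
    norm_sq = (s ** 2).sum(dim=dim, keepdim=True)
    norm = torch.sqrt(norm_sq + eps)
    return (norm_sq / (1.0 + norm_sq)) * (s / norm)


class PooledCapsNet(nn.Module):
    """Vector CapsNet sketch with extra pooling up front and a deeper decoder.

    Layer sizes, input resolution, and class count are illustrative guesses,
    not the configuration reported in the paper.
    """

    def __init__(self, num_classes=24, img_size=64, routing_iters=3):
        super().__init__()
        self.num_classes, self.img_size = num_classes, img_size
        self.routing_iters = routing_iters
        # Conv + pooling stack: the added pooling layers down-sample the
        # input and suppress fine background detail before capsules form.
        self.features = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),  # added pooling layer
            nn.Conv2d(64, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),  # added pooling layer
        )
        # Primary capsules: 32 maps of 8-D capsule vectors.
        self.primary = nn.Conv2d(128, 32 * 8, kernel_size=9, stride=2)
        grid = (img_size // 4 - 9) // 2 + 1          # spatial size after conv
        self.n_primary = 32 * grid * grid
        # Transformation matrices for dynamic routing to 16-D class capsules.
        self.W = nn.Parameter(
            0.01 * torch.randn(1, self.n_primary, num_classes, 16, 8))
        # Extended reconstruction decoder: one hidden layer more than the
        # original 512-1024 design (an assumption about what "extend" means).
        self.decoder = nn.Sequential(
            nn.Linear(16 * num_classes, 512), nn.ReLU(),
            nn.Linear(512, 1024), nn.ReLU(),
            nn.Linear(1024, 2048), nn.ReLU(),        # added decoder layer
            nn.Linear(2048, img_size * img_size), nn.Sigmoid(),
        )

    def forward(self, x, y_onehot=None):
        b = x.size(0)
        u = self.primary(self.features(x))                    # (b, 256, g, g)
        u = u.view(b, 32, 8, -1).permute(0, 1, 3, 2).reshape(b, -1, 8)
        u = squash(u)                                         # (b, n_primary, 8)
        # Predicted votes from each primary capsule for each class capsule.
        u_hat = (self.W @ u[:, :, None, :, None]).squeeze(-1)  # (b, n_p, C, 16)
        logits = torch.zeros(b, self.n_primary, self.num_classes, 1,
                             device=x.device)
        for _ in range(self.routing_iters):                   # dynamic routing
            c = F.softmax(logits, dim=2)                      # coupling coeffs
            v = squash((c * u_hat).sum(dim=1))                # (b, C, 16)
            logits = logits + (u_hat * v.unsqueeze(1)).sum(-1, keepdim=True)
        lengths = v.norm(dim=-1)                              # class scores
        if y_onehot is None:                                  # mask at inference
            y_onehot = F.one_hot(lengths.argmax(1), self.num_classes).float()
        masked = (v * y_onehot.unsqueeze(-1)).flatten(1)
        recon = self.decoder(masked).view(b, 1, self.img_size, self.img_size)
        return lengths, recon


model = PooledCapsNet()                       # hypothetical 24-class setup
probs, recon = model(torch.randn(2, 1, 64, 64))
print(probs.shape, recon.shape)               # (2, 24) and (2, 1, 64, 64)
```

Training such a model would typically combine the margin loss on capsule lengths with a down-weighted reconstruction (MSE) loss, as in the original CapsNet; those details are omitted here.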