Why do we need such a dataset?
Minimally Invasive Surgery (MIS) is a highly sensitive medical procedure. A typical MIS procedure involves two surgeons: a main surgeon and an assistant surgeon. The success of any MIS procedure depends on multiple factors, such as the attentiveness and competence of both surgeons and the effectiveness of their coordination.
According to the Lancet Commission, each year 4.2 million people die within 30 days of surgery [nepogodiev2019global]. Another study at Johns Hopkins University states that 10% of all deaths in the USA are due to medical error [jhustudy].
Artificial Intelligence is being used in many applications where human error must be mitigated, and the proposed dataset is a step in the same direction. To make surgical procedures safer, we should be able to identify and track the actions of both the main and the assistant surgeon. This dataset was developed with the assistance of medical professionals as well as an expert surgeon. More details on the dataset can be found in section 4.
How is this going to help push the research forward? Although many datasets exist for action detection, there is no existing dataset for action detection in medical computer vision. Given the complexity of the scene and the difficulty of detecting surgeon actions, this dataset will set forward a path and a benchmark for the medical computer vision research community. In our experiments, we found that it is very difficult to correctly localise the bounding box for any action; more discussion on this is provided in section 3.
Briefly, how do we create it?
What are the resulting main contributions?
2 Related work
Related endoscopic vision works?
Related endoscopic imaging datasets?
Action detection works and datasets?
3 Problem statement
Problems based on the image data; a localisation-type problem: there are no specific boundaries for an action bounding box.
4 ESAD Dataset
The proposed dataset specifically focuses on the prostatectomy procedure. We recorded four full prostatectomy procedures with the consent of the patients. In the second stage, we formalised the set of actions that a surgeon can perform during prostatectomy; after thorough analysis we finalised a set of 21 actions. The list of actions along with the number of samples is given in table 1.
The complete dataset is divided into three sets: training, validation, and test. The training set contains two complete prostatectomy procedures; ESAD has 18793 annotated frames for training with a total of 27998 action instances. The class-wise distribution of samples is given in table 1. The validation set has 4576 annotated frames with 7120 action instances, and the test set comprises 6088 annotated frames with 11207 action instances.
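As a quick illustration, per-split statistics of this kind can be computed directly from the annotations. The file layout and field names below are hypothetical, since the actual ESAD annotation format is not specified here; this is only a sketch with toy data:

```python
from collections import Counter

# Hypothetical annotation structure: per split, a mapping from frame id to
# the list of action labels (one entry per action instance in that frame).
# The real ESAD annotation format may differ.
annotations = {
    "train": {"f0001": ["CuttingTissue"], "f0002": ["PullingTissue", "Suction"]},
    "val":   {"f0001": ["Suction"]},
    "test":  {"f0001": ["CuttingTissue", "CuttingTissue"]},
}

def split_summary(split):
    """Return (annotated frame count, action instance count, class histogram)."""
    frames = annotations[split]
    labels = [lbl for lbls in frames.values() for lbl in lbls]
    return len(frames), len(labels), Counter(labels)

n_frames, n_instances, hist = split_summary("train")
print(n_frames, n_instances, hist)
```

On the real annotations, the same counts would reproduce the frame and instance totals reported above for each split.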
Instead of randomly assigning samples to each of the sets, we use complete surgeries as one set. The reason behind this choice is that we do not want any of the sets to be biased toward one class. Secondly, choosing a whole procedure as one set preserves the natural rate of sample occurrence during a real procedure. As we can see in table 1, some classes have many more samples than others.
Some samples from the dataset are shown in the figure.