In their daily work, engineers in semiconductor Failure Analysis (FA) laboratories generate numerous documents recording the tasks, findings, and conclusions related to every device they handle. These documents hold valuable knowledge that other experts in the laboratory can consult, but because each one pertains to a particular device and its processing history, finding answers to specific questions across the collection is difficult, if not practically impossible. This paper therefore proposes a Natural Language Processing (NLP) solution to make gathering FA knowledge from numerous documents more efficient. It explains how the authors generated a dataset of FA reports, along with the corresponding electrical signatures and physical failures, in order to train different machine-learning algorithms and compare their performance. Three of the most common classification algorithms were used in the study: K-Nearest Neighbors (kNN), Support Vector Machines (SVM), and Deep Neural Networks (DNN). All of the resulting classification models were able to capture patterns associated with different types of failures and predict their causes. The SVM classifier produced the best results, and all classifiers performed slightly better at predicting physical failures. The reasons are discussed in the paper, which also provides suggestions for future work.