Classification of Semiconductor Wafer Defect Types Using a Deep Learning Model

I conducted an industry-academic cooperation project with Hyundai Mobis Vander Shinhan Precision Industry Co., Ltd.

My name is Mingyo Kim. I am an industrial management engineer with research experience in the Industrial Intelligence Lab.

Let me start my research presentation.

COVID-19 has affected production at semiconductor manufacturers.

Once the decline in production became a fait accompli, manufacturers that used to produce both high-end and low-cost semiconductors shifted their focus to high-end semiconductors, which have higher unit prices.

As a result, a few months later a shortage of low-cost semiconductors emerged, including a supply shortage of automotive semiconductors, which are a type of low-cost semiconductor.

In addition, the automotive industry is shifting from manufacturing internal combustion engine cars to manufacturing electric cars.

This chain of semiconductor shortages has spread around the world. Taiwan's TSMC has announced that it will increase the supply of automotive semiconductors, but the outcome is unclear at the moment.

Responding to the automotive semiconductor shortage is a current issue. Companies around the world, including Hyundai Motor, have taken measures to cut production.

By cooperating with Hyundai Mobis Vander Shinhan Precision Industrial Co., Ltd. to present a solution for the semiconductor industry, I aim to help address the semiconductor supply shortage both domestically and globally.

Also, I conducted research activities to attract new customers using the service of Shinhan Precision Industry Co., Ltd.

Using artificial intelligence to classify the defect types of semiconductor wafers makes it possible to identify which defect types occur and the proportion of each, so an increase in yield can be expected.

The yield is calculated by dividing the actual number of normal chips by the maximum number of chips designed and multiplying by 100.
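As a small illustration of the yield formula, here is a minimal sketch. The function name `wafer_yield` and the example numbers are made up for illustration and do not come from the project data.

```python
def wafer_yield(normal_chips, max_designed_chips):
    """Yield in percent: normal chips / maximum designed chips * 100."""
    if max_designed_chips <= 0:
        raise ValueError("max_designed_chips must be positive")
    return 100.0 * normal_chips / max_designed_chips


# Hypothetical wafer: 500 chips designed, 450 of them normal.
example_yield = wafer_yield(450, 500)
print(example_yield)  # 90.0
```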

The data used is the WM-811K wafer map dataset, which contains 811,457 wafer maps.

It contains information such as the wafer map, wafer die size, lot name, and wafer index.

If you look at the data, you can see that it also contains data that is not labeled.

I will use only the labeled data and exclude the unlabeled data. The labeled data are imbalanced across classes.
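The filtering step can be sketched as follows. This is only an illustration on a handful of made-up records; the field names `waferIndex` and `failureType`, and the use of `None` for unlabeled rows, are assumptions rather than the dataset's actual storage format.

```python
from collections import Counter

# Hypothetical records standing in for rows of the wafer map dataset.
records = [
    {"waferIndex": 1, "failureType": "Center"},
    {"waferIndex": 2, "failureType": None},        # unlabeled
    {"waferIndex": 3, "failureType": "Edge-Ring"},
    {"waferIndex": 4, "failureType": "Edge-Ring"},
    {"waferIndex": 5, "failureType": None},        # unlabeled
    {"waferIndex": 6, "failureType": "Edge-Ring"},
]

# Keep only the rows that carry a label.
labeled = [r for r in records if r["failureType"] is not None]

# Counting samples per class makes the imbalance visible.
class_counts = Counter(r["failureType"] for r in labeled)
print(len(labeled), dict(class_counts))
```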

This is part of the result of visualizing the data.

The data has a total of eight defect types: Center, Donut, Edge-Loc, Edge-Ring, Loc, Random, Scratch, and Near-full.

I will compare the performance of SVM, CNN, and RF models.

▶Support vector machine

I will use the support vector machine to classify defect types.

Based on a given dataset, the SVM algorithm creates a non-probabilistic binary linear classification model that determines which category new data will fall into.

The resulting classification model is represented as a boundary in the space onto which the data is mapped, and the SVM algorithm finds the boundary with the largest margin.

In addition to linear classification, SVMs can also be used in nonlinear classification.

To classify nonlinearly, the given data must be mapped into a high-dimensional feature space. In this study, line integrals were computed using the Radon transform.
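To make the difference between linear and nonlinear classification concrete, here is a minimal sketch using scikit-learn on a synthetic two-ring dataset. The dataset, the RBF kernel, and all parameter values are assumptions chosen for illustration; the source does not state which kernel or settings the project used.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: no straight line can separate them.
x, y = make_circles(n_samples=300, noise=0.05, factor=0.4, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0
)

# A linear SVM looks for a separating line in the original 2D space.
linear_acc = SVC(kernel="linear").fit(x_train, y_train).score(x_test, y_test)

# An RBF-kernel SVM implicitly works in a high-dimensional feature space.
rbf_acc = SVC(kernel="rbf", gamma="scale").fit(x_train, y_train).score(x_test, y_test)

print(f"linear: {linear_acc:.2f}, rbf: {rbf_acc:.2f}")
```

On data like this, the kernel SVM is expected to separate the rings while the linear one cannot.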

The wafer map was divided into 13 regions, and the defect density was calculated for each region.
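One way to compute such region densities is sketched below. The layout of the 13 regions (four edge strips plus a 3x3 inner grid) and the encoding of defective dies as the value 2 are assumptions for illustration; the source does not spell out how the regions were defined.

```python
import numpy as np


def defect_density(region, defect_value=2):
    """Percentage of cells in the region that are marked defective."""
    if region.size == 0:
        return 0.0
    return 100.0 * np.count_nonzero(region == defect_value) / region.size


def region_densities(wafer_map):
    """Return 13 densities: 4 edge strips followed by a 3x3 inner grid."""
    rows, cols = wafer_map.shape
    r = np.linspace(0, rows, 6, dtype=int)  # boundaries of 5 row bands
    c = np.linspace(0, cols, 6, dtype=int)  # boundaries of 5 column bands
    regions = [
        wafer_map[r[0]:r[1], :],  # top edge strip
        wafer_map[r[4]:r[5], :],  # bottom edge strip
        wafer_map[:, c[0]:c[1]],  # left edge strip
        wafer_map[:, c[4]:c[5]],  # right edge strip
    ]
    for i in range(1, 4):
        for j in range(1, 4):
            regions.append(wafer_map[r[i]:r[i + 1], c[j]:c[j + 1]])
    return np.array([defect_density(region) for region in regions])


# Hypothetical 25x25 wafer with a defective block in the middle.
wafer = np.ones((25, 25), dtype=int)
wafer[10:15, 10:15] = 2
densities = region_densities(wafer)
print(densities.round(1))
```

In this toy example only the central inner region has a non-zero density, which is the kind of signature a Center defect would produce.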

This chart shows the defect density in each of the 13 regions for the eight failure types.

At this point, we use the Radon transform, a method used mainly for CT scans in hospitals, as a way to eliminate noise.

The Radon transform is an image processing technique used to convert tomography data into two-dimensional images in the medical and geological fields.

To find a particular shape or pattern in a two-dimensional image, the Radon transform and its inverse are applied to the image data as an additional processing step.

It calculates the integral of the image's pixels along a line that lies at a given distance from the origin of the two-dimensional image and is perpendicular to the direction at a given angle. In other words, it sums all the pixel values on that line.
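The idea of summing pixel values along lines at each angle can be sketched with a simple rotate-and-sum approximation. This is an illustrative stand-in built on `scipy.ndimage.rotate`, not the routine the project used; the function name `radon_transform` and the test image are assumptions.

```python
import numpy as np
from scipy import ndimage


def radon_transform(image, angles_deg):
    """Approximate Radon transform by rotating the image and summing columns.

    Returns an array of shape (image.shape[1], len(angles_deg)); column k
    holds the projection of the image at angles_deg[k].
    """
    image = np.asarray(image, dtype=float)
    sinogram = np.zeros((image.shape[1], len(angles_deg)))
    for k, angle in enumerate(angles_deg):
        rotated = ndimage.rotate(image, angle, reshape=False, order=1)
        sinogram[:, k] = rotated.sum(axis=0)  # sum pixel values along each line
    return sinogram


# Hypothetical 32x32 image containing a horizontal bar of defects.
wafer = np.zeros((32, 32))
wafer[14:18, 4:28] = 1.0
angles = np.arange(0, 180, 10)
sinogram = radon_transform(wafer, angles)
print(sinogram.shape)  # (32, 18)
```

At 0 degrees the projection equals the plain column sums of the image; at 90 degrees the bar lines up with the summation direction and produces a sharp peak.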

This figure shows the results of the Radon transform for the eight failure types.

However, these results cannot be used directly as features because the wafer map sizes differ.

In the next step, fixed-dimension feature values are obtained from the row means and row standard deviations of the Radon transform.

Each is fixed at 20 dimensions.

A total of 40 dimensions are extracted as Radon-based features.
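A minimal sketch of turning a variable-size Radon result into 40 fixed features follows. Linear interpolation via `np.interp` is an assumption here, since the source does not say which resampling method was used, and the random arrays only stand in for real sinograms.

```python
import numpy as np


def radon_features(sinogram, n_points=20):
    """Resample row mean and row std to n_points each and concatenate them."""
    row_mean = sinogram.mean(axis=1)
    row_std = sinogram.std(axis=1)
    old_grid = np.linspace(0.0, 1.0, row_mean.size)
    new_grid = np.linspace(0.0, 1.0, n_points)
    mean_fixed = np.interp(new_grid, old_grid, row_mean)
    std_fixed = np.interp(new_grid, old_grid, row_std)
    return np.concatenate([mean_fixed, std_fixed])


# Two stand-in sinograms of different sizes give feature vectors of equal length.
rng = np.random.default_rng(0)
features_small = radon_features(rng.random((26, 36)))
features_large = radon_features(rng.random((64, 36)))
print(features_small.shape, features_large.shape)  # (40,) (40,)
```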

The figure on the left shows the Radon-based features from the row mean, and the figure on the right shows the Radon-based features from the row standard deviation.

For the geometry-based features, the most prominent region was identified, which also filters out noise.

A region-labeling algorithm was used, and the largest region was selected as the most prominent region.

Based on the prominent region, we extracted geometric features such as area, perimeter, major axis length, minor axis length, solidity, and eccentricity.
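The region-labeling step can be sketched with `scipy.ndimage.label`. This example covers only two of the listed features (area and an eccentricity estimate from the covariance of pixel coordinates, which is an approximation and not identical to what image libraries report); the defect encoding as the value 2 and the function name are assumptions.

```python
import numpy as np
from scipy import ndimage


def largest_region_features(wafer_map, defect_value=2):
    """Label connected defect regions and describe the largest one."""
    labels, n_regions = ndimage.label(wafer_map == defect_value)
    if n_regions == 0:
        return {"area": 0, "eccentricity": 0.0}
    sizes = np.bincount(labels.ravel())[1:]  # skip background label 0
    largest = int(sizes.argmax()) + 1
    rows, cols = np.nonzero(labels == largest)
    area = int(rows.size)
    if area < 2:
        return {"area": area, "eccentricity": 0.0}
    # Eigenvalues of the coordinate covariance give the spread along each axis.
    minor_var, major_var = np.linalg.eigvalsh(np.cov(np.stack([rows, cols])))
    if major_var <= 0:
        return {"area": area, "eccentricity": 0.0}
    ratio = min(max(minor_var / major_var, 0.0), 1.0)
    return {"area": area, "eccentricity": float(np.sqrt(1.0 - ratio))}


# Hypothetical wafer: one elongated defect region plus a single noisy pixel.
wafer = np.ones((20, 20), dtype=int)
wafer[5:8, 3:13] = 2
wafer[15, 15] = 2
features = largest_region_features(wafer)
print(features)
```

Selecting the largest region discards the isolated pixel, which is how this step doubles as noise filtering.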

I implemented multi-class and multi-label learning algorithms with SVM, and the accuracy (ACC) was 82%.

▶Convolutional neural network

Convolutional neural networks are a type of multi-layer feed-forward artificial neural network.

They are classified as deep neural networks in deep learning and are mainly used for visual image analysis.

They are applied to image and video recognition, recommendation systems, image classification, medical image analysis, and natural language processing.

In a feed-forward neural network, passing through a hidden layer means that the layer's inputs are multiplied by weights, a bias is added, and the result becomes the input of the activation function.
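The hidden-layer computation can be written out in a few lines of NumPy. The weights, bias, input values, and the choice of ReLU as the activation are all made up for illustration.

```python
import numpy as np


def relu(z):
    """Rectified linear unit: negative values become zero."""
    return np.maximum(z, 0.0)


def hidden_layer(x, weights, bias, activation=relu):
    """Weight the inputs, add the bias, then apply the activation function."""
    z = weights @ x + bias
    return activation(z)


x = np.array([2.0, 1.0])
weights = np.array([[1.0, -1.0],
                    [0.5, 0.5]])
bias = np.array([0.0, -2.0])

hidden = hidden_layer(x, weights, bias)
print(hidden)  # [1. 0.]
```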

We confirmed that there are eight defect types: Center, Donut, Edge-Loc, Edge-Ring, Loc, Random, Scratch, and Near-full.

As seen in the data description section, the data we currently use has an imbalance problem.

Therefore, we will proceed with data augmentation to solve the problem of data imbalance.

To augment the data, I considered a GAN and an autoencoder.

Although a GAN is related to the autoencoder, I selected the autoencoder, focusing on how effectively it can increase the amount of training data that is lacking.

We use a 2D convolutional autoencoder on wafer maps whose dimensions have been expanded.

An encoder model and a decoder model were created as part of the autoencoder model layer.

The next step adds noise.

Noise is added to the encoded vectors of the defective wafers.

The decoder reconstructs new wafer data from the noised vectors, and this new data is added to the original defective wafer data.

Through this process, we increased the data by 2,000 for each defect type.

This solved the problem of data imbalance.
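The augmentation idea (encode, add noise, decode, append to the originals) can be sketched as below. To keep the example small and runnable, a random linear map and its pseudo-inverse stand in for the trained convolutional encoder and decoder; these stand-ins, the function name, and the noise level are assumptions and not the project's model.

```python
import numpy as np

rng = np.random.default_rng(0)


def augment_with_latent_noise(x, encode, decode, n_new, noise_std=0.1):
    """Return the original samples plus n_new samples decoded from noised codes."""
    idx = rng.integers(0, len(x), size=n_new)  # source samples, with replacement
    z = encode(x[idx])
    z_noised = z + rng.normal(0.0, noise_std, size=z.shape)
    return np.concatenate([x, decode(z_noised)], axis=0)


# Stand-in linear encoder/decoder (a real setup would use the trained autoencoder).
w = rng.normal(size=(16, 4))
w_pinv = np.linalg.pinv(w)


def encode(x):
    return x @ w


def decode(z):
    return z @ w_pinv


# Five hypothetical flattened wafer maps of one minority defect type.
minority = rng.random((5, 16))
augmented = augment_with_latent_noise(minority, encode, decode, n_new=20)
print(augmented.shape)  # (25, 16)
```

Repeating this per defect type until each class reaches a target count is one way to even out the class sizes.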

The next step is to define a model-building function and validate the model through cross-validation.

For cross-validation, the K-fold cross-validation method was used.

K-fold splits the data into k folds and repeats validation k times, each time holding out a different fold.

K-fold was used because it gives a more reliable accuracy estimate for datasets with a small total number of samples.

In our model, we set k=3, and the cross-validation score was 97%.
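How the folds are formed and rotated can be sketched in plain NumPy. A nearest-centroid classifier on synthetic 2D data stands in for the CNN here so the example stays small; the helper names and the data are assumptions for illustration.

```python
import numpy as np


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, val_idx) pairs; each fold is held out exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]


def nearest_centroid_accuracy(x_tr, y_tr, x_va, y_va):
    """Stand-in model: assign each validation point to the closest class mean."""
    classes = np.unique(y_tr)
    centroids = np.stack([x_tr[y_tr == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(x_va[:, None, :] - centroids[None, :, :], axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred == y_va).mean())


# Synthetic two-class data.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, (30, 2)), rng.normal(4, 1, (30, 2))])
y = np.array([0] * 30 + [1] * 30)

splits = list(k_fold_indices(len(x), k=3))
scores = [nearest_centroid_accuracy(x[tr], y[tr], x[va], y[va]) for tr, va in splits]
mean_score = float(np.mean(scores))
print(scores, mean_score)
```

The reported cross-validation score is the mean over the k held-out folds.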

This is the result of the CNN model.

The accuracy of the CNN model was 99%. As shown in the table on the right, the accuracy (ACC) steadily increases and converges to 1, while the loss converges to zero.

▶Random forest

The third model we proceeded with is Random Forest.

Random forest is an algorithm that is often used in practice because of its simple principles and its better performance than most single decision trees.

We trained a random forest to compare its performance with the two models described above.

Deep learning was performed using the Theano library and Keras.

I separated the training set and the test set.

Accuracy was computed using the accuracy_score function.

The maximum number of leaf nodes of the decision tree was set to 2 and to 50 for comparison.

Accuracy is the percentage of all predicted targets that are classified correctly.

The random forest predicted with approximately 89% accuracy and performed better than the decision tree.
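The comparison between decision trees and a random forest can be sketched with scikit-learn on synthetic data. Reading the "last node count" as the `max_leaf_nodes` parameter is an assumption, as are the dataset and all other settings; the numbers this prints are unrelated to the project's results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic 4-class data standing in for the wafer features.
x, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, n_classes=4, random_state=0
)
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0
)

models = {
    "tree_2_leaves": DecisionTreeClassifier(max_leaf_nodes=2, random_state=0),
    "tree_50_leaves": DecisionTreeClassifier(max_leaf_nodes=50, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Fit each model on the training set and score it on the held-out test set.
accuracy = {}
for name, model in models.items():
    model.fit(x_train, y_train)
    accuracy[name] = accuracy_score(y_test, model.predict(x_test))

for name, acc in accuracy.items():
    print(f"{name}: {acc:.2f}")
```

A tree limited to two leaves can predict at most two of the four classes, so it serves as a weak baseline against which the forest's gain is easy to see.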

In the next step, we examined the confusion matrix to check whether the random forest is a suitable model.

You can see that the 'Near-full' defect, labeled number 7, is predicted most accurately.
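Reading per-class performance off a confusion matrix can be sketched as follows. The three-class labels and predictions are made up for illustration and are unrelated to the project's results.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


# Hypothetical true labels and predictions for three classes.
y_true = np.array([0, 0, 1, 1, 2, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2, 0])

cm = confusion_matrix(y_true, y_pred, n_classes=3)

# Fraction of each true class that was predicted correctly (per-class recall).
per_class_recall = cm.diagonal() / cm.sum(axis=1)
best_class = int(per_class_recall.argmax())
print(cm)
print(per_class_recall.round(2), best_class)
```

The class with the highest value on the normalized diagonal is the one the model predicts most reliably.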

▶Conclusion◀

Here are the accuracies of SVM, CNN, and Random Forest.

CNN showed the highest accuracy with 99%, followed by Random Forest with 89% and SVM with 82%.

Let me present the results of the study. First, identifying the defect type supports an increase in yield.

High yield is the top priority in the semiconductor industry, so knowing which defect types lower yield makes it possible to target the processes that need improvement.

The second is the improvement of the existing defect classification method.

We have derived meaningful research results by adding deep learning to the defect type classification method that was previously used by the company.

With the CNN, we built a model that is 99% accurate and can be applied directly to the company's system.

Third, it becomes easier to locate where on the wafer map the defects occur.

Since the wafer map is divided into 13 regions, the defect location is easy to identify.

Fourth, the approach is practical because it can be applied in various fields.

Because it applies to any process where data imbalance exists, it is likely to be practical and can be expected to extend to areas beyond manufacturing processes.

Hyundai Mobis Vander Shinhan Precision Industrial Co., Ltd. has expressed its position on the results of the study.
