My short answer?
Here is a long explanation with my data and model.
At the end of January 2020, I changed my V2Ray server settings from vmess over TCP to vmess over TLS + WebSocket.
I deployed the V2Ray server to two different cloud providers with different UUIDs, IPs, ports, domain names, OS flavors, and TLS certificates. The traffic data from Google Compute Engine (GCE) is used for training and validation, while the data from Amazon Lightsail (LS) is used for testing. As you can see from the traffic files below, there is a time gap in the GCE data because there was a power outage in my neighborhood and my router wasn’t hooked up to a UPS. I didn’t realize it until a week later.
drwxr-xr-x 2 Ricky Ricky  16M Mar  1 16:31 tcpsorter.20200129.20200206.GCE.TLS.WS
-rw-r--r-- 1 Ricky Ricky 1.4G Feb 29 23:35 tcpsorter.20200129.20200206.GCE.TLS.WS.tar.gz
drwxr-xr-x 2 Ricky Ricky  34M Mar  1 16:23 tcpsorter.20200213.20200229.GCE.TLS.WS
-rw-r--r-- 1 Ricky Ricky 3.3G Feb 29 23:38 tcpsorter.20200213.20200229.GCE.TLS.WS.tar.gz
drwxr-xr-x 2 Ricky Ricky  14M Mar 11 17:58 tcpsorter.20200304.20200311.LS.TLS.WS
-rw-r--r-- 1 Ricky Ricky 1.3G Mar 11 17:49 tcpsorter.20200304.20200311.LS.TLS.WS.tar.gz
Here is the summary of the data from GCE. I split 80% of the data for training and 20% of the data for validation:
Statistics: Total V2ray traffic 32005, Total non-V2ray traffic 346431
Output train traffic 51208, Total validation traffic 12802
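The counts above are consistent with a class-balanced split: 2 × 32005 = 64010 flows, of which 80% is 51208 and 20% is 12802. The post does not say how the split was made, so the following is only a sketch of one way to arrive at those numbers, assuming the non-V2Ray flows were randomly undersampled to match the V2Ray count. The helper name `balanced_split` and the integer stand-ins for flow files are hypothetical.

```python
import random


def balanced_split(v2ray, other, train_frac=0.8, seed=0):
    """Undersample the larger class, then shuffle and split into train/validation.

    Hypothetical helper: the original notebook may do this differently.
    """
    rng = random.Random(seed)
    n = min(len(v2ray), len(other))
    # Keep n samples per class so both classes are equally represented.
    samples = [(x, 1) for x in rng.sample(v2ray, n)]
    samples += [(x, 0) for x in rng.sample(other, n)]
    rng.shuffle(samples)
    cut = round(len(samples) * train_frac)
    return samples[:cut], samples[cut:]


# Integer IDs stand in for the per-flow .bin files.
train, val = balanced_split(list(range(32005)), list(range(346431)))
print(len(train), len(val))
```

Balancing the classes before training keeps the much larger non-V2Ray class from dominating the loss, at the cost of discarding most of the non-V2Ray flows.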
Here is the summary of the data from LS. I used all of it to test the model trained on the GCE data. The previously trained model has never seen the LS data, so there is no data leak.
Statistics: Total V2ray traffic 10872, Total non-V2ray traffic 76316
Output train traffic 87188, Total validation traffic 0
When I trained the model on the GCE data, I used early stopping, monitoring the validation loss. Training stopped after the first epoch with a validation accuracy of 0.9999.
Epoch 1/1
1600/1600 [==============================] - 603s 377ms/step - loss: 0.0068 - accuracy: 0.9981 - val_loss: 7.2060e-07 - val_accuracy: 0.9999
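For readers unfamiliar with early stopping, here is a minimal plain-Python sketch of the idea: keep training while the validation loss improves, and stop once it has failed to improve for a set number of epochs. In Keras this role is played by the `keras.callbacks.EarlyStopping` callback; the post does not show the callback settings, so the `patience` value, the `EarlyStopper` class, and the loss values below are all made up for illustration.

```python
class EarlyStopper:
    """Stop after `patience` consecutive epochs without a better validation loss."""

    def __init__(self, patience=1, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.wait = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            # New best loss: remember it and reset the counter.
            self.best = val_loss
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience


def epochs_run(val_losses, patience=1):
    """Return the epoch (1-based) at which training would stop."""
    stopper = EarlyStopper(patience=patience)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.should_stop(loss):
            return epoch
    return len(val_losses)


# Made-up validation losses: improvement stalls after epoch 2.
print(epochs_run([0.50, 0.30, 0.31, 0.32, 0.10], patience=2))
```

With `patience=2` the loop stops at epoch 4 and never sees the later improvement, which is the usual trade-off of a small patience value.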
The ROC curve of validation data looks perfect.
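As a reminder of what the ROC curve measures, the sketch below computes the curve points and the area under them by sweeping the decision threshold over the predicted scores. It is a generic illustration with toy labels and scores, not the plotting code from the notebook; the function names `roc_points` and `auc` are hypothetical.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) points by sweeping the threshold from high to low."""
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with identical scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data: 1 = V2Ray, 0 = non-V2Ray; scores are invented.
print(auc(roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])))  # 0.75
print(auc(roc_points([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])))   # 1.0
```

A curve that "looks perfect" corresponds to the second case: every V2Ray flow scores higher than every non-V2Ray flow, so the area is 1.0.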
Then, I collected data from AWS Lightsail for a week and loaded the model to run inference only on the LS data. See the result below:
# Create evaluation generator
eval_generator = PacketDataGenerator(eval_file_list, shuffle=False)
eval_result = model.evaluate_generator(eval_generator, workers=3, use_multiprocessing=True, verbose=1)
2724/2724 [==============================] - 18s 6ms/step
print(eval_result)
[7.367991372575489e-08, 0.9997590780258179]
The ROC curve of test data looks perfect. The accuracy is 0.999759.
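Because the LS test set is imbalanced, it helps to put the accuracy next to the trivial baseline. The arithmetic below uses only the counts and the accuracy reported above. The error count is a rough estimate that assumes the accuracy was computed over all 87188 flows, which the post does not state (a batched generator may skip a partial last batch).

```python
v2ray, other = 10872, 76316          # LS flow counts from the summary above
total = v2ray + other                # 87188
accuracy = 0.9997590780258179        # reported by evaluate_generator

# Accuracy of a classifier that always answers "non-V2Ray".
baseline = other / total
# Rough number of misclassified flows implied by the reported accuracy.
errors = round((1 - accuracy) * total)

print(f"baseline={baseline:.4f} errors~{errors}")
```

An always-negative classifier would already score about 0.875 here, so the meaningful comparison is 0.9998 against 0.875, or roughly 21 misclassified flows out of about 87 thousand.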
I didn’t release the Python notebook this time because nothing has changed compared to the previous vmess over TCP notebook except the data. But I did upload the trained model to my GitHub repo.
Note that my non-V2Ray traffic contains a variety of traffic types that pass through my home router. For example, the V2Ray server uses a port other than 443, so I can count the likely HTTPS flows by their port:
[Ricky@gtx tcpsorter.20200304.20200311.LS.TLS.WS]$ ls -alh | grep "\.443\.bin$" | wc -l
70616
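The same count can be done from Python, which is handy inside a notebook. This is a small sketch that assumes, as the shell command above does, that each flow file name ends in `.<port>.bin`; the function name `count_port_files` is hypothetical.

```python
from pathlib import Path


def count_port_files(directory, port=443):
    """Count flow files whose name ends in '.<port>.bin'."""
    suffix = f".{port}.bin"
    return sum(1 for p in Path(directory).iterdir()
               if p.is_file() and p.name.endswith(suffix))
```

For example, `count_port_files("tcpsorter.20200304.20200311.LS.TLS.WS")` would count the port-443 flows in the LS directory.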
No, V2Ray with TLS can NOT blend in with other HTTPS traffic. You are just as exposed as when using vmess over TCP.
I’ve been stuck in a procrastination loop lately due to the pollen season. Even after taking non-drowsy anti-allergy medicine, I’m still sleepy. But I will keep looking for a solution.
This result is no surprise. There is a long-standing issue discussing exactly this; you may refer to it: https://github.com/v2ray/v2ray-core/issues/1660
This may indicate that your request was not sent by a web browser, but it does NOT guarantee that you can tell it apart from data transferred by another application with the same signature.
The training and validation data sets include HTTPS traffic and also other TLS traffic such as Netflix. The V2Ray traffic wrapped inside TLS can still be distinguished from the other TLS traffic.
The TLS handshake of V2Ray has been found to be buggy: https://github.com/v2ray/v2ray-core/issues/2509
This also makes me wonder whether some of the features actually originate from the default TLS library of Golang. It may behave differently and thus be detected by the model. I suggest you do some comparisons across SSL/TLS libraries.
Thanks for letting me know. I’m not an expert on TLS. I’m reading a paper here.
The current fix in V2Ray is not quite acceptable. It simply removes the hard-coded cipher list and replaces it with the default Go TLS cipher list.
That may make V2Ray traffic blend in with Go app traffic. Given how rare Go apps are as clients anywhere in the world, it is going to be busted as well.
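The concern here, that the offered cipher list identifies the client library regardless of the payload, can be made concrete with a JA3-style fingerprint, which hashes the version, cipher suites, extensions, curves, and point formats offered in the ClientHello. This is a sketch of that general technique, not of what the CNN learned. The numeric lists are invented for illustration and are not the real V2Ray or Go values.

```python
import hashlib


def is_grease(value):
    """GREASE placeholders (RFC 8701) look like 0x0a0a, 0x1a1a, ..., 0xfafa."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja3(version, ciphers, extensions, curves, point_formats):
    """Return (JA3 string, MD5 hex digest) for the given ClientHello fields."""
    def join(values):
        # GREASE values are random per connection, so they are left out.
        return "-".join(str(v) for v in values if not is_grease(v))

    text = ",".join([str(version), join(ciphers), join(extensions),
                     join(curves), join(point_formats)])
    return text, hashlib.md5(text.encode("ascii")).hexdigest()


# Hypothetical ClientHello contents; only the cipher order differs.
client_a = ja3(771, [49195, 49199, 52393, 52392], [0, 10, 11, 13], [29, 23], [0])
client_b = ja3(771, [49199, 49195, 52393, 52392], [0, 10, 11, 13], [29, 23], [0])
print(client_a[0])
print(client_a[1] == client_b[1])
```

Even a reordering of the same cipher suites yields a different digest, which is why swapping one distinctive list for another library's default list only changes which fingerprint the client carries.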
In any case, I’d expect a migration to uTLS.
Once they are done with a real fix, I will retrain the CNN model and see if it can classify them.