-
Notifications
You must be signed in to change notification settings - Fork 284
Expand file tree
/
Copy pathrsyslog.srt
More file actions
1200 lines (908 loc) · 20.7 KB
/
rsyslog.srt
File metadata and controls
1200 lines (908 loc) · 20.7 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1
00:00:00,847 --> 00:00:04,148
Hello, my name is Kevin Backhouse,
and I'm a security researcher
2
00:00:04,177 --> 00:00:04,860
at Semmle.
3
00:00:04,884 --> 00:00:08,139
This video is about a vulnerability that was found
4
00:00:08,173 --> 00:00:09,665
by my colleague Bas van Schaik
5
00:00:09,757 --> 00:00:11,628
in the rsyslog project.
6
00:00:12,742 --> 00:00:15,456
What I'm going to do in this video is explain a bit
7
00:00:15,511 --> 00:00:18,049
about what the vulnerability was.
8
00:00:18,083 --> 00:00:19,736
I'm also going to show you
9
00:00:20,341 --> 00:00:25,353
how you can use QL to search
10
00:00:25,398 --> 00:00:27,231
for the vulnerability--
11
00:00:27,277 --> 00:00:29,560
the coding pattern that caused this vulnerability.
12
00:00:30,264 --> 00:00:32,338
I'm going to show how easy that is to do in QL.
13
00:00:34,338 --> 00:00:41,480
Rsyslog is an open source logging system maintained
14
00:00:41,526 --> 00:00:42,827
by Rainer Gerhards.
15
00:00:45,337 --> 00:00:51,288
We reported this vulnerability to him
and he fixed it incredibly quickly.
16
00:00:51,318 --> 00:00:52,280
I think it was only
17
00:00:52,304 --> 00:00:54,808
about 24 hours between the time
18
00:00:54,846 --> 00:00:56,745
that we reported the vulnerability to him
19
00:00:56,791 --> 00:00:59,575
and the time that he had posted a fix
20
00:00:59,631 --> 00:01:01,041
for it on GitHub.
21
00:01:02,034 --> 00:01:03,262
Something I'd also
like to point out
22
00:01:03,280 --> 00:01:07,011
about rsyslog, is that Rainer takes security
23
00:01:07,057 --> 00:01:09,710
or quality incredibly seriously.
24
00:01:10,357 --> 00:01:12,639
He runs all the static analysis
25
00:01:12,668 --> 00:01:14,150
and dynamic analysis tools
26
00:01:14,648 --> 00:01:16,032
that he can get his hands on on this project.
27
00:01:16,740 --> 00:01:20,756
This was a bug that had not been detected
28
00:01:20,796 --> 00:01:22,220
by any of those other tools.
29
00:01:27,144 --> 00:01:30,633
That shows the value of Semmle's technology
30
00:01:30,761 --> 00:01:34,320
that we can find problems that get missed
31
00:01:34,375 --> 00:01:36,818
by other tools.
32
00:01:39,563 --> 00:01:46,722
Now, the backstory behind this vulnerability
is that towards the end of 2017,
33
00:01:46,775 --> 00:01:47,888
I was looking at this code
34
00:01:47,923 --> 00:01:49,073
in the curl project.
35
00:01:52,560 --> 00:01:54,993
I'd never seen this coding pattern before.
36
00:01:55,038 --> 00:01:58,666
I didn't even know that snprintf had a return value,
37
00:01:58,697 --> 00:02:01,654
so I was quite interested to see this coding pattern.
38
00:02:02,891 --> 00:02:05,427
It's quite clever the way that it's being used here.
39
00:02:05,481 --> 00:02:08,555
What they're doing is they're printing
40
00:02:08,746 --> 00:02:11,748
a list of strings into a fixed size buffer,
41
00:02:12,699 --> 00:02:14,155
and they're using
42
00:02:14,209 --> 00:02:15,858
the return value of snprintf
43
00:02:16,145 --> 00:02:19,792
to keep track of what offset
44
00:02:19,813 --> 00:02:22,181
in the buffer they need to print to next.
45
00:02:23,898 --> 00:02:27,807
They're using the size argument of snprintf
46
00:02:27,849 --> 00:02:30,809
to make sure that they don't write
47
00:02:31,229 --> 00:02:32,689
beyond the end of the buffer.
48
00:02:34,282 --> 00:02:38,444
Because I'd never seen snprintf used
49
00:02:38,475 --> 00:02:40,683
like this before, I pulled up the man page
50
00:02:40,731 --> 00:02:45,363
for snprintf to see how it works.
51
00:02:50,241 --> 00:02:52,487
Now, if we search for RETURN here,
52
00:02:52,525 --> 00:02:54,711
we get the documentation on the return value.
53
00:02:55,532 --> 00:02:56,982
There's a very interesting thing here.
54
00:02:59,052 --> 00:03:01,409
It says that the return value
55
00:03:01,447 --> 00:03:02,695
is the number of characters
56
00:03:02,742 --> 00:03:04,385
which would have been written
57
00:03:04,437 --> 00:03:07,323
to the final string if enough space had been available.
58
00:03:10,686 --> 00:03:13,830
What that means is that this code
59
00:03:13,881 --> 00:03:16,236
that we just saw in curl looks unsafe.
60
00:03:16,959 --> 00:03:21,668
Because if there's a string that is too big,
61
00:03:22,791 --> 00:03:29,451
it exceeds this value, then,
if say the string is 100 bytes
62
00:03:29,497 --> 00:03:34,446
but there's only 10 bytes left in the buffer,
then ptr is still going to get incremented
63
00:03:34,477 --> 00:03:36,164
by 100 bytes
64
00:03:36,731 --> 00:03:40,292
and so on the next iteration of the loop, we can continue,
65
00:03:40,888 --> 00:03:42,419
and we're going to continue printing.
66
00:03:42,608 --> 00:03:52,317
What's worse is that this sizeof subtraction here
67
00:03:52,349 --> 00:03:56,230
is actually going to compute
the answer minus 90,
68
00:03:56,500 --> 00:04:00,731
but the input to snprintf,
the size input to snprintf,
69
00:04:00,761 --> 00:04:02,537
is a size_t, it's an unsigned value.
70
00:04:03,140 --> 00:04:05,724
We're going to get a negative integer overflow here.
71
00:04:07,506 --> 00:04:09,093
That means that there's going to be effectively
72
00:04:09,122 --> 00:04:11,146
no limit on the number of bytes
73
00:04:11,342 --> 00:04:13,270
that we can write to the buffer.
74
00:04:14,108 --> 00:04:17,861
Now, it turns out that
this is not actually a problem in curl,
75
00:04:17,907 --> 00:04:20,720
because they've got
their own implementation of snprintf,
76
00:04:20,755 --> 00:04:21,755
which you can see here.
77
00:04:23,520 --> 00:04:27,530
This version of snprintf works the way
78
00:04:27,530 --> 00:04:31,479
that you need it to in order
to use this coding pattern.
79
00:04:32,453 --> 00:04:34,285
So, it returns the number of bytes
80
00:04:34,316 --> 00:04:36,303
that it actually wrote rather than the number of bytes
81
00:04:36,351 --> 00:04:37,776
that it wanted to write.
82
00:04:38,800 --> 00:04:39,932
There's no problem in curl,
83
00:04:40,066 --> 00:04:42,961
but after having seen this coding pattern,
84
00:04:43,007 --> 00:04:44,007
I thought,
85
00:04:44,333 --> 00:04:45,318
"Well, I wonder if anybody else
86
00:04:45,342 --> 00:04:48,184
has done the same thing
but using the real snprintf.
87
00:04:48,205 --> 00:04:49,205
Because if they have,
88
00:04:49,657 --> 00:04:52,623
then they might have a buffer overflow in their code."
89
00:04:54,869 --> 00:04:58,695
Let's now go to Eclipse and use QL to search
90
00:04:58,742 --> 00:05:00,662
for this vulnerability.
91
00:05:02,677 --> 00:05:04,543
What I'm going to do first is something
92
00:05:04,572 --> 00:05:06,727
that you could do equally well
93
00:05:06,926 --> 00:05:07,952
with grep,
94
00:05:07,987 --> 00:05:11,476
but we're very quickly going to refine
95
00:05:11,517 --> 00:05:12,517
this query to something
96
00:05:12,573 --> 00:05:14,320
that is much more sophisticated
97
00:05:14,373 --> 00:05:15,111
than what you can do
98
00:05:15,135 --> 00:05:16,135
with grep.
99
00:05:16,704 --> 00:05:18,518
To begin with, I'm just going to search
100
00:05:18,551 --> 00:05:22,034
for all calls to snprintf in the project.
101
00:05:32,752 --> 00:05:33,875
108 results.
102
00:05:35,123 --> 00:05:41,282
Now, 108 results is not that many.
103
00:05:41,707 --> 00:05:44,095
You could realistically audit all of these
104
00:05:44,134 --> 00:05:45,199
by hand.
105
00:05:48,111 --> 00:05:51,302
One of our customers actually had
106
00:05:51,350 --> 00:05:57,018
this exact same vulnerability on some proprietary code.
107
00:05:59,587 --> 00:06:02,190
If you were to run this query on their code base,
108
00:06:02,234 --> 00:06:04,692
then you'd get over 600 results.
109
00:06:06,042 --> 00:06:08,568
That's going to be quite tedious to look through
110
00:06:08,614 --> 00:06:09,614
by hand.
111
00:06:10,135 --> 00:06:13,291
QL can really help you, save you some time
112
00:06:13,464 --> 00:06:16,270
by quickly ruling out the cases
113
00:06:16,334 --> 00:06:18,508
that are not interesting to look at.
114
00:06:18,545 --> 00:06:24,296
The most common reason that the call to snprintf
115
00:06:24,343 --> 00:06:26,143
is safe is that you're just not using
116
00:06:26,188 --> 00:06:27,974
the return value of snprintf.
117
00:06:28,565 --> 00:06:32,051
Let's update the query to rule out those cases.
118
00:06:37,864 --> 00:06:42,013
We just want to say that the call is not in a void context.
119
00:06:46,533 --> 00:06:53,463
If we run the query that way,
then we're now down to 45 results.
120
00:06:54,612 --> 00:06:55,803
Now, let's look at one of these.
121
00:06:58,149 --> 00:07:04,102
This is another common reason
that it's unlikely to be a problem,
122
00:07:04,590 --> 00:07:09,918
this format specifier contains only a %d.
123
00:07:11,623 --> 00:07:14,164
Now, a %d is going to be quite difficult
124
00:07:14,199 --> 00:07:18,197
for an attacker to do anything with,
125
00:07:18,956 --> 00:07:23,398
because the number of bytes
126
00:07:23,429 --> 00:07:24,376
that you can generate
127
00:07:24,400 --> 00:07:25,902
with a %d is limited
128
00:07:25,902 --> 00:07:30,317
and also the content of the string
129
00:07:30,349 --> 00:07:31,349
that you can generate
130
00:07:31,373 --> 00:07:33,365
with it is quite limited,
131
00:07:33,415 --> 00:07:36,326
because you can only generate ASCII digits
132
00:07:36,813 --> 00:07:39,693
rather than an arbitrary list of characters.
133
00:07:40,212 --> 00:07:43,934
What an attacker is really
going to want is a %s,
134
00:07:43,934 --> 00:07:47,802
because that allows them to control both
135
00:07:47,900 --> 00:07:49,746
the size of the string that's generated
136
00:07:49,794 --> 00:07:51,343
and also the exact content
137
00:07:51,343 --> 00:07:52,848
of the string that's generated.
138
00:07:53,994 --> 00:07:55,678
Something that I would also like to point out
139
00:07:55,710 --> 00:08:01,583
about this example is that we're now looking
at a call to snprintf that is not all in one line.
140
00:08:01,941 --> 00:08:05,405
That's quickly going to become a pain to search
141
00:08:05,452 --> 00:08:07,033
for with regular expressions.
142
00:08:07,196 --> 00:08:09,855
When you're searching with QL,
143
00:08:10,199 --> 00:08:11,388
that's not a problem at all,
144
00:08:11,439 --> 00:08:13,341
because we're looking
145
00:08:13,387 --> 00:08:15,405
at the abstract syntax tree of the code
146
00:08:15,440 --> 00:08:17,271
rather than just doing text based searching.
147
00:08:19,769 --> 00:08:21,716
What we're going to do here is we're going to add
148
00:08:21,748 --> 00:08:24,715
a condition that the format specifier
149
00:08:24,777 --> 00:08:26,578
should contain a %s.
150
00:08:30,815 --> 00:08:35,082
We use zero-based indexing
for the arguments of a call.
151
00:08:35,131 --> 00:08:38,075
The format specifier is going to be argument number two,
152
00:08:38,602 --> 00:08:40,710
and we can get the string
153
00:08:40,763 --> 00:08:41,530
like this
154
00:08:41,554 --> 00:08:43,820
and then we can do regular expression matching on it
155
00:08:43,911 --> 00:08:44,911
like this.
156
00:08:46,911 --> 00:08:49,762
We want it to contain some stuff followed
157
00:08:49,792 --> 00:08:51,959
by a %s, followed
158
00:08:51,990 --> 00:08:53,023
by some more stuff.
159
00:08:54,836 --> 00:08:55,537
Now, by default,
160
00:08:55,561 --> 00:09:00,942
regular expressions don't match strings
161
00:09:00,942 --> 00:09:02,578
that contain a newline character.
162
00:09:05,687 --> 00:09:07,505
Format strings very frequently
163
00:09:07,553 --> 00:09:09,304
do contain newline characters.
164
00:09:09,751 --> 00:09:14,387
We're going to add this special syntax here.
165
00:09:14,435 --> 00:09:17,973
We're using Java's regular expression syntax.
166
00:09:18,413 --> 00:09:20,345
That's how you tell it to
167
00:09:20,400 --> 00:09:23,277
also match strings that contain newline characters.
168
00:09:24,725 --> 00:09:25,725
Let's run that now.
169
00:09:27,125 --> 00:09:28,863
Now we're down to 18 results.
170
00:09:30,807 --> 00:09:32,197
Let's have a look at some of these.
171
00:09:33,059 --> 00:09:36,781
Now, that one is in fact, the vulnerability.
172
00:09:37,302 --> 00:09:38,616
We'll get back to that later.
173
00:09:38,663 --> 00:09:39,406
Let's first look
174
00:09:39,430 --> 00:09:40,496
at some others to see
175
00:09:42,498 --> 00:09:44,140
which other ones we might want to rule out.
176
00:09:44,167 --> 00:09:47,422
This is an example of a result
177
00:09:47,465 --> 00:09:48,715
that we're not interested in
178
00:09:48,762 --> 00:09:53,146
because here the result of the return value
of snprintf is being used,
179
00:09:53,179 --> 00:09:54,603
but it's being used in a safe way
180
00:09:54,662 --> 00:09:57,065
because we're using it to check that
181
00:09:57,102 --> 00:10:02,068
it didn't want to write
more characters than we expected.
182
00:10:03,325 --> 00:10:09,279
Really, the vulnerability that we're looking
for is the case where you have a loop,
183
00:10:10,125 --> 00:10:15,276
and the output of snprintf is getting fed back
184
00:10:15,316 --> 00:10:17,840
into the size parameter of snprintf.
185
00:10:17,876 --> 00:10:24,831
That's what allows the buffer overflow to happen,
186
00:10:24,880 --> 00:10:28,059
and also the negative integer overflow here,
187
00:10:28,172 --> 00:10:32,182
which is essential in order to have the vulnerability.
188
00:10:34,018 --> 00:10:35,347
What I'm going to do now is
189
00:10:35,403 --> 00:10:37,876
I'm going to use one of our libraries,
190
00:10:37,911 --> 00:10:39,466
I'm going to use the taint tracking library
191
00:10:39,483 --> 00:10:41,700
to see if there's data flow
192
00:10:42,313 --> 00:10:45,658
from the output back into the size argument.
193
00:10:49,160 --> 00:10:51,094
I have to have an import statement first.
194
00:10:56,484 --> 00:11:00,078
I'm going to import dataflow's TaintTracking library.
195
00:11:03,790 --> 00:11:08,162
Then we're going to add a source and a sink.
196
00:11:09,222 --> 00:11:11,531
Normally, when you use the TaintTracking library,
197
00:11:11,832 --> 00:11:13,110
the source would be something
198
00:11:13,175 --> 00:11:15,180
that you believe to be tainted
199
00:11:15,211 --> 00:11:17,804
because it's controllable by an attacker,
200
00:11:18,649 --> 00:11:22,101
and the sink would be a place
201
00:11:22,157 --> 00:11:23,645
where it's used in a dangerous way.
202
00:11:25,241 --> 00:11:26,241
QL is flexible,
203
00:11:26,473 --> 00:11:27,517
and so we can use it
204
00:11:27,896 --> 00:11:31,753
for this kind of situation
as well which is slightly different,
205
00:11:31,829 --> 00:11:32,922
where we're going to just say,
206
00:11:33,483 --> 00:11:36,065
"I want the source to be the call to snprintf,
207
00:11:36,110 --> 00:11:41,137
and I want the sink to be the size argument of snprintf."
208
00:11:42,214 --> 00:11:43,538
It will do that equally well.
209
00:11:48,139 --> 00:11:49,298
"call" is the source,
210
00:11:50,066 --> 00:12:00,507
and the sink is the size argument.
211
00:12:05,733 --> 00:12:06,733
In fact,
212
00:12:06,998 --> 00:12:10,367
because the call is the source of the data flow,
213
00:12:12,068 --> 00:12:13,059
this thing
214
00:12:13,083 --> 00:12:16,273
about it being in a void context is now redundant,
215
00:12:16,303 --> 00:12:18,431
so we can remove that
from the query if we want.
216
00:12:19,629 --> 00:12:21,040
It would be harmless if we left it in.
217
00:12:22,772 --> 00:12:23,632
Now, we need to actually
218
00:12:23,656 --> 00:12:25,618
use the TaintTracking library
219
00:12:25,671 --> 00:12:26,713
to connect up the source
220
00:12:26,762 --> 00:12:27,786
and the sink.
221
00:12:31,730 --> 00:12:35,276
We only need to look for local data flow,
222
00:12:35,325 --> 00:12:38,469
that means within the function.
For this particular pattern,
223
00:12:39,108 --> 00:12:41,790
it's always going to be within a single function.
224
00:12:42,074 --> 00:12:43,909
In general, our TaintTracking library
225
00:12:44,522 --> 00:12:48,845
supports interprocedural taint tracking as well,
226
00:12:49,177 --> 00:12:50,000
we just don't need it here,
227
00:12:50,023 --> 00:12:52,202
so that's why we're using the localTaint function.
228
00:12:56,102 --> 00:12:57,500
Let's run that.
229
00:12:58,193 --> 00:13:04,690
Now we have exactly one result,
which is the vulnerability right here.
230
00:13:05,637 --> 00:13:07,283
Now, there's one final thing
231
00:13:07,283 --> 00:13:08,642
that I'd like to show you.
232
00:13:09,184 --> 00:13:16,074
It's often useful to use range analysis
233
00:13:16,113 --> 00:13:20,330
to see what sizes we're able to deduce.
234
00:13:21,064 --> 00:13:25,794
In this particular case, we're interested
in the size argument into snprintf.
235
00:13:25,794 --> 00:13:29,793
If we were to apply range analysis to it
236
00:13:29,838 --> 00:13:35,206
and find out that it has an upper bound of 10,
then there's probably nothing wrong.
237
00:13:36,580 --> 00:13:39,646
I'm just going to show you
how to use the range analysis library.
238
00:13:40,435 --> 00:13:41,685
We need to import it like this,
239
00:13:46,968 --> 00:13:52,405
and then we can just add
an extra column to the results list.
240
00:13:53,083 --> 00:13:54,253
We're going to include
241
00:13:54,290 --> 00:13:57,858
the upper bound of the size argument.
242
00:14:04,174 --> 00:14:09,012
What I'm also going to do is I'm going to apply
getFullyConverted to the size argument.
243
00:14:12,062 --> 00:14:13,089
Just run that now.