Function calling?

Question

xetrics opened this issue 7 years ago · comments

Nate commented 7 years ago

You should consider adding functionality for calling functions by address. The equivalent in C would be:

int ( __cdecl *addValues )( int a, int b ) = ( int ( __cdecl* )( int, int ) )0xDEADBEEF;
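For readers unfamiliar with the idiom, here is a minimal in-process sketch of the same idea in portable C++. It assumes the caller already knows the function's signature. The names `AddFn`, `callByAddress` and `addressOfAddValues` are made up for illustration, and the `__cdecl` qualifier is left out so the sketch also compiles outside MSVC.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the target function. In the real scenario it
// lives in another module and only its address is known.
int addValues(int a, int b) { return a + b; }

// The signature must be known up front to build a usable pointer type.
using AddFn = int (*)(int, int);

// Turn a raw address into a callable pointer and invoke it.
int callByAddress(std::uintptr_t address, int a, int b) {
    assert(address != 0); // a null address cannot be called
    AddFn fn = reinterpret_cast<AddFn>(address);
    return fn(a, b);
}

// Produce the address the way a debugger or scanner would report it.
std::uintptr_t addressOfAddValues() {
    return reinterpret_cast<std::uintptr_t>(&addValues);
}
```

Calling `callByAddress(addressOfAddValues(), 5, 10)` goes through the pointer exactly like a direct call. This only works inside the process that owns the function, which is why the rest of the thread turns to remote threads.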

Rob commented 6 years ago

Yep.

Rob · Answer 1 · Tue Mar 07 2017 05:44:36 GMT+0800 (China Standard Time)

I'm not a C++ programmer, so I'll have to read up on this. If you could help at all, that would be great. The aim of this implementation would be something similar to:

var args = [{
  type: memoryjs.INT,
  value: 5
}, {
  type: memoryjs.INT,
  value: 10
}]
var returnType = memoryjs.INT
var offset = 0xDEADBEEF // offset of `addValues`

memoryjs.executeFunction(args, returnType, offset, (err, result) => {
  console.log(result) // 15
})

Something dynamic like this would be a lot better than hard coding a huge array of different function combinations. Do you know if something like this is possible?

Also if *addValues is the function pointer, would you simply call it by doing addValues(4, 5)?

Nate · Answer 2 · Tue Mar 07 2017 06:50:11 GMT+0800 (China Standard Time)

Yes, here is a helpful article I feel could help https://www.unknowncheats.me/wiki/Calling_Functions_From_Injected_Library_Using_Function_Pointers_in_C%2B%2B

I am a C++ programmer but not very experienced with node-gyp. I could help with some of the C if necessary.

Rob · Answer 3 · Tue Mar 07 2017 07:33:11 GMT+0800 (China Standard Time)

I can handle the JS implementation. Looking at the example code I posted above, do you know how to recreate your code snippet without hard coding the parameters of the function? How can you recreate the function pointer with an unknown number of parameters, each with a given data type?

Nate · Answer 4 · Tue Mar 07 2017 13:30:54 GMT+0800 (China Standard Time)

varargs and templates (looking for a more elaborate example)

Nate · Answer 5 · Tue Mar 07 2017 13:37:11 GMT+0800 (China Standard Time)

OH, I completely forgot that we are not working with an injected library here (where I usually work from)

We need to do something along the lines of https://msdn.microsoft.com/en-us/library/windows/desktop/ms682437(v=vs.85).aspx

What would need to happen is that a stub would need to be code caved into the process that accepts a pointer to multiple arguments, as CreateRemoteThread (CRT) only supports passing one argument to the thread. Not quite sure if this is a one-man job. If you create a PR I would be sure to help.

This may seem like tons of work, but I do believe it will take your module to the next level with its capabilities.

I will write a stub in asm that you can WPM (WriteProcessMemory) when I get home tomorrow.

Rob · Answer 6 · Wed Mar 08 2017 07:19:02 GMT+0800 (China Standard Time)

Here is semi pseudo code for the C++ code cave implementation:

typedef int(__stdcall *__addValues)(INT, INT);
class CaveParams {
public:
  int a = 3, b = 5; // function parameters
  DWORD addValues; // function address in memory
  // how do we build this class dynamically?
};

DWORD __stdcall RemoteThread(CaveParams *cp) {
  __addValues addFunc = (__addValues) cp->addValues;
  addFunc(cp->a, cp->b);
  // type definition for the addValues function is still required for it to be called?
  return 0;
}

int main() {
  // copy the routine into the target process (sizeof(CaveParams) is only a placeholder for the routine's real size)
  LPVOID eRemoteThread = VirtualAllocEx(hProcess, NULL, sizeof(CaveParams), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
  WriteProcessMemory(hProcess, eRemoteThread, (LPCVOID) RemoteThread, sizeof(CaveParams), NULL);
  // copy the parameters into the target process
  CaveParams* params = (CaveParams*) VirtualAllocEx(hProcess, NULL, sizeof(CaveParams), MEM_COMMIT, PAGE_READWRITE);
  CaveParams caveParams;
  WriteProcessMemory(hProcess, params, &caveParams, sizeof(CaveParams), NULL);
  HANDLE hRemoteThread = CreateRemoteThread(hProcess, 0, 0, (LPTHREAD_START_ROUTINE) eRemoteThread, params, 0, 0);
}

Just looked a bit into code caving and rewrote some sample code to point out 2 issues I still don't understand:

1. Inside the RemoteThread function that's being run inside the process, I still don't see how it's possible to call the function without knowing beforehand the exact number of parameters. Even if this is possible, the CaveParams class was hard coded beforehand to know the number of parameters and their data types?
2. The function addValues still needs to be defined before it is called inside of the RemoteThread. Maybe I'm still missing something?

From what I understand so far:

1. Use CreateRemoteThread to run a function we want inside of the process.
2. CreateRemoteThread only accepts one parameter, so create our own function to run inside of the process that accepts a pointer to parameters.
3. Inside of the function we create, call the process's internal function (get the function arguments and address from the RemoteThread pointer parameter).
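The "one pointer in, everything behind it" pattern from the steps above can be tried locally before any remote process is involved. The following is a sketch under stated assumptions: the routine is invoked on the current thread as a stand-in for CreateRemoteThread, and `CaveParams`, `threadRoutine` and `startRoutine` are illustrative names, not part of memoryjs.

```cpp
#include <cassert>

// Everything the routine needs, bundled behind a single pointer.
struct CaveParams {
    int (*target)(int, int); // address of the function to call
    int a;
    int b;
    int result;              // written by the routine
};

// Hypothetical function we want the routine to call.
int addValues(int a, int b) { return a + b; }

// Same shape as a thread start routine: one opaque pointer parameter.
void threadRoutine(void* raw) {
    assert(raw != nullptr);
    CaveParams* p = static_cast<CaveParams*>(raw);
    p->result = p->target(p->a, p->b);
}

// Local stand-in for "create a thread at this routine with this parameter".
using StartRoutine = void (*)(void*);
void startRoutine(StartRoutine routine, void* parameter) {
    routine(parameter);
}
```

The limitation raised in the two questions above is visible here: `CaveParams` and the `int (*)(int, int)` type are fixed at compile time, so a different signature needs a different struct and routine.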

Nate · Answer 7 · Wed Mar 08 2017 11:09:05 GMT+0800 (China Standard Time)

We cannot have a predefined stub function - one must be generated due to parameters. Here is an example.

Say we wanted to call addValues(1, 5). Here is what the stub would look like:

push 5
push 1
call 0xFF

That would then need to be converted to the opcode for each instruction, so we can WPM it:
\x6A\x05\x6A\x01\xE8\x00\x00\x00\x00 (0x6A is push imm8; the four bytes after 0xE8 hold the call's relative offset)

Then we can CRT that function.
(If the function only has a single parameter we can just call CRT with that 1 parameter)

Also, if we were to have to push strings we would need to VirtualAllocEx them and then push the pointer
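As a rough illustration of generating such a stub from a list of arguments, here is a sketch for 32-bit x86 that handles only small integers. It assumes the `push imm8` opcode 0x6A (the one used in the working example later in this thread) and leaves the `call rel32` operand zeroed. `buildPushCallStub` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Emit 32-bit x86 machine code: push each argument right to left with
// `push imm8` (0x6A), then `call rel32` (0xE8) with a zeroed operand.
std::vector<std::uint8_t> buildPushCallStub(const std::vector<int>& args) {
    std::vector<std::uint8_t> code;
    for (auto it = args.rbegin(); it != args.rend(); ++it) {
        assert(*it >= -128 && *it <= 127); // imm8 holds a signed byte only
        code.push_back(0x6A);
        code.push_back(static_cast<std::uint8_t>(*it));
    }
    code.push_back(0xE8);
    for (int i = 0; i < 4; ++i) {
        code.push_back(0x00); // placeholder, patched once addresses are known
    }
    return code;
}
```

For `addValues(1, 5)` this yields `6A 05 6A 01 E8 00 00 00 00`. The last argument is pushed first, matching the `push 5` / `push 1` order in the assembly above.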

Rob · Answer 8 · Wed Mar 08 2017 18:49:27 GMT+0800 (China Standard Time)

Had a quick look and it seems the easiest way to convert from ASM to opcode is to build up the ASM in an array (loop over each parameter and add the push instruction, then add the call instruction last) and then loop over this array and build up a new bytes array that can be written to memory. Sounds easy but it's probably a lot harder than I'm thinking

Nate · Answer 9 · Thu Mar 09 2017 08:51:21 GMT+0800 (China Standard Time)

Here is a functioning library written by my friend, he uses it in some exploits of his.
https://gist.github.com/3dsboy08/d9ae62a4ef677df870647ba8297dcfd1

Rob · Answer 10 · Fri Mar 10 2017 05:28:32 GMT+0800 (China Standard Time)

Can't get any code cave example to work. I've tried your friend's plus other sample code online with the calculator example (getting a message box to display on calc.exe) but it causes the calculator to crash whenever the remote thread is created

Alex · Answer 11 · Sat Apr 15 2017 00:13:57 GMT+0800 (China Standard Time)

Is this still being implemented? I need this functionality for my project.

Rob · Answer 12 · Sat Apr 15 2017 00:20:57 GMT+0800 (China Standard Time)

I've been busy with other projects and currently have no plan to look into this again yet, sorry! Pull requests are welcome if you want to try implementing it yourself.

x052 · Answer 13 · Sun Jun 11 2017 04:21:49 GMT+0800 (China Standard Time)

Also need this functionality for a project, if you find some time could you take another look at it?

SalameDefumado · Answer 14 · Thu Feb 22 2018 06:43:19 GMT+0800 (China Standard Time)

@Rob-- when u have some time please watch this video if it helps you
https://www.youtube.com/watch?v=0NwlWaT9NEY

Rob · Answer 15 · Thu Feb 22 2018 09:23:48 GMT+0800 (China Standard Time)

@SalameDefumado Watched the video but I can't think of a way to implement this.

We need to define a structure that contains the parameters and address of the function. This can't be done dynamically due to the way C++ is compiled. The solution to this is that users will need to edit the CPP file and add their own structures manually for each function they want to call. They will also need to create and define their own routines that will be injected into the target process.

What I can implement is the allocation of memory and CRT. However it will not be possible to implement something that allows the user to call a function completely from JS space without editing the CPP files and recompiling.

Nate · Answer 16 · Thu Feb 22 2018 16:49:14 GMT+0800 (China Standard Time)

I know this is old but my solution is still possible. Building a universal codecave dynamically based on the parameters and calling it via CRT. Should work fine with integers but some issues arise when you need to push strings.
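One possible way to handle the string case, sketched here purely as an assumption and not as what was eventually implemented: write the NUL-terminated bytes into their own allocation in the target, then push that allocation's absolute address with `push imm32` (0x68). The base address passed to `emitPush32` below stands in for whatever VirtualAllocEx would return. Both helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Bytes to write into the target: the characters plus a NUL terminator.
std::vector<std::uint8_t> stringBytes(const std::string& text) {
    assert(text.find('\0') == std::string::npos); // embedded NULs would truncate
    std::vector<std::uint8_t> bytes(text.begin(), text.end());
    bytes.push_back(0x00);
    return bytes;
}

// Append `push imm32` (0x68) with the value in little-endian byte order.
void emitPush32(std::vector<std::uint8_t>& code, std::uint32_t value) {
    code.push_back(0x68);
    for (int shift = 0; shift < 32; shift += 8) {
        code.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}
```

With a string allocated at 0x00500000 in the target, `emitPush32(code, 0x00500000)` appends `68 00 00 50 00`, so the callee receives an ordinary `const char*`.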

Rob · Answer 17 · Thu Feb 22 2018 20:32:45 GMT+0800 (China Standard Time)

@xetrics so it's possible to directly WPM instruction bytes?

Okay so this means that the CaveData struct can be passed from JS to C++ and the lib will iterate over the structure to produce the byte array of opcode/operand?

For the ASM, how would we call the function that we have the address of? Previously you stated call 0xFF, can you expand on this and explain please?

Edit: neither the video's example nor your friend's example works for me, in both cases the target application crashes. ZeroMemory's example also doesn't make sense, he is assuming the size of the remote thread is the size of CaveData, so I tried using your friend's GetFunctionSize function and I get an access violation reading memory error at if (*(BYTE*)(Func + Size) == 0xF4).

Nate · Answer 18 · Fri Feb 23 2018 00:27:28 GMT+0800 (China Standard Time)

@Rob--

This is a pretty messy area of reverse engineering, accurately detecting the size of functions as all compilers are not the same, and some programs even add random padding in-between stubs to make this even harder. The function is just looping over the entire function, and adds to the size until it sees a ret instruction (0xF4). This can get sloppy at times, but it is the best solution when you are reverse engineering.

To fix your access violation, use VirtualQuery to make sure the page is readable (for example PAGE_READONLY or PAGE_EXECUTE_READ).

To call the function using call, you would need to set the operand as ptr ds:[address of function]

Here is a useful site http://ref.x86asm.net/coder32.html

NOTE 90% of the stuff we are doing only works in 32 bit.

Rob · Answer 19 · Fri Feb 23 2018 01:37:36 GMT+0800 (China Standard Time)

I've tried a few different ways to get the size of the function:

Using your friend's function
RPM a byte at a time and comparing it to 0xF4
Adding a fake function after the definition of the routine to get the difference between the base addresses

I created an empty console project and used that as the target process. The target process and the injecting process were both 32 bit. In all cases the target process crashes. It would be incredibly helpful if you could provide a complete working example project.

I'm not too experienced with ASM, but according to your link 0xF4 is HLT, whereas 0xC2 and 0xC3 are RET. I've tried all the stated methods with 0xF4 and 0xC3 but nothing is working for me so far.
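To make the byte-scanning approach concrete, here is a small sketch of the "stop at the first 0xC3" heuristic. It is deliberately naive and shows one reason the heuristic is fragile: 0xC3 can also appear inside an instruction's operand. `naiveFunctionSize` is a hypothetical name, and the scan runs over a local buffer rather than over memory read with RPM.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Naive size estimate: the index just past the first 0xC3 (near RET) byte.
// Returns 0 when no such byte is found within maxLen bytes.
std::size_t naiveFunctionSize(const std::uint8_t* code, std::size_t maxLen) {
    assert(code != nullptr);
    for (std::size_t i = 0; i < maxLen; ++i) {
        if (code[i] == 0xC3) {
            return i + 1;
        }
    }
    return 0;
}
```

For `6A 01 6A 01 E8 00 00 00 00 83 C4 08 C3` the estimate is the full 13 bytes. For `68 C3 00 00 00 C3` (push 0xC3, then ret) it stops after 2 bytes, in the middle of the push operand, even though the code is 6 bytes long. A reliable answer needs an instruction-length decoder rather than a byte search.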

Nate · Answer 20 · Wed Feb 28 2018 23:51:18 GMT+0800 (China Standard Time)

Sorry, I looked at your reply in school and forgot to make an example when I got home. I'll try and reply tonight.

Rob · Answer 21 · Thu Mar 01 2018 23:18:06 GMT+0800 (China Standard Time)

Sure thing my dude. When you make the example can you please tell me what OS and what OS version you ran it on?

Nate · Answer 22 · Sat Mar 03 2018 13:08:01 GMT+0800 (China Standard Time)

Currently working on a way to do them dynamically, it's kind of a headache but here is how you call functions externally with static stuff (sorry for messy code, wrote this up in like 15 minutes):

	PROCESSENTRY32 pe = { sizeof(PROCESSENTRY32) };
	auto snapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, NULL);
	int pid = 0;
	if (Process32First(snapshot, &pe)) {
		do { // check the first entry too, then walk the rest of the snapshot
			if (!strcmp(pe.szExeFile, "target.exe")) {
				pid = pe.th32ProcessID;
				break;
			}
		} while (Process32Next(snapshot, &pe));
	}
	auto proc = OpenProcess(PROCESS_ALL_ACCESS, FALSE, pid);
	/*
	below shellcode in asm:
	push 1
	push 1
	call 0x000000    ;   placeholder for the relative address to be calculated
	add esp, 8       ;   amt of args * 4
	ret
	*/

	BYTE rgShellcode[] = {0x6A, 0x01, 0x6A, 0x01, 0xE8, 0x0, 0x0, 0x0, 0x0, 0x83, 0xC4, 0x08, 0xC3};
	void* pShellcode = VirtualAllocEx(proc, NULL, sizeof(rgShellcode), MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE); // allocate the shellcode in the target process
	*(uintptr_t*)(rgShellcode + 5) = (uintptr_t)ADDRESS - (uintptr_t)pShellcode - 9; // using the position of the newly allocated memory and target function, calculate the call address relative to the new memory
	WriteProcessMemory(proc, pShellcode, rgShellcode, sizeof(rgShellcode), NULL); // actually write the shellcode to the new memory
	CreateRemoteThread(proc, NULL, NULL, (LPTHREAD_START_ROUTINE)pShellcode, NULL, NULL, NULL); // call our stub

Be sure to disable ASLR in your target process. (we will need to compensate for this eventually)
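The `- 9` in the snippet above comes from how `call rel32` is encoded: the operand is relative to the address of the instruction after the call, and the call sits at offset 4 and is 5 bytes long. Here is a sketch of that arithmetic with the offset made explicit. The function names and the addresses in the usage note are invented for illustration, and 32-bit addresses are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Operand for `call rel32`: target minus the address of the instruction
// after the 5-byte call. Unsigned wraparound gives the two's-complement form.
std::uint32_t relativeCallOperand(std::uint32_t stubBase, std::size_t callOffset,
                                  std::uint32_t target) {
    std::uint32_t nextInstruction =
        stubBase + static_cast<std::uint32_t>(callOffset) + 5;
    return target - nextInstruction;
}

// Write the operand, little-endian, right after the 0xE8 opcode.
void patchCall(std::vector<std::uint8_t>& stub, std::size_t callOffset,
               std::uint32_t stubBase, std::uint32_t target) {
    assert(callOffset + 5 <= stub.size() && stub[callOffset] == 0xE8);
    std::uint32_t rel = relativeCallOperand(stubBase, callOffset, target);
    for (int i = 0; i < 4; ++i) {
        stub[callOffset + 1 + i] = static_cast<std::uint8_t>((rel >> (8 * i)) & 0xFF);
    }
}
```

With the stub allocated at 0x00500000, the call at offset 4 and a target at 0x00401030, the operand is 0xFFF01027, stored as `27 10 F0 FF`.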

Rob · Answer 23 · Sun Mar 04 2018 08:20:23 GMT+0800 (China Standard Time)

Okay yeah this code works for me, how do we go about pushing anything other than an int to the stack?

Nate · Answer 24 · Sun Mar 04 2018 08:44:47 GMT+0800 (China Standard Time)

It all comes down to knowledge of datatypes in memory. For example, for strings you would need to allocate and WPM the string, then push its address (absolute or relative, not sure on this one). For bools, you can just push 0x01 or 0x00. I'm having issues with a weird paradox right now on the dynamic building of the push instructions. If you want to take a look, here it is (ignore all the ugly code and debug stuff):

enum ARGTYPE {T_INT, T_NONE};
struct arg {
	ARGTYPE type;
	void* value;
};

void fnCallRoutine(HANDLE proc, std::vector<arg> pargs, ARGTYPE preturn, void* address) {
	unsigned char nparams_c = (unsigned char)pargs.size();
	std::vector<unsigned char> array_queue = { 0xE8, 0x0, 0x0, 0x0, 0x0, 0x83, 0xC4, nparams_c, 0xC3 };

	for (auto i = 0; i < pargs.size(); i++) {
		
		switch (pargs[i].type) {
		case T_INT:
			int param = (int)pargs[i].value;
			array_queue.insert(array_queue.begin(), (param >> 24) & 0xFF);
			array_queue.insert(array_queue.begin(), (param >> 16) & 0xFF);
			array_queue.insert(array_queue.begin(), (param >> 8) & 0xFF);
			array_queue.insert(array_queue.begin(), param & 0xFF);
			array_queue.insert(array_queue.begin(), 0x68); 
			break;
		}
	}

	unsigned char* rgShellcode = array_queue.data();
	printf("0x%02x, 0x%02x, 0x%02x, 0x%02x, 0x%02x, 0x%02x, 0x%02x, 0x%02x, 0x%02x", rgShellcode[0], rgShellcode[1], rgShellcode[2], rgShellcode[3], rgShellcode[4], rgShellcode[5], rgShellcode[6], rgShellcode[7], rgShellcode[8]);
	getchar();
	void* pShellcode = VirtualAllocEx(proc, NULL, sizeof(rgShellcode), MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE);
	*(uintptr_t*)(rgShellcode + 5) = (uintptr_t)address - (uintptr_t)rgShellcode - 9;
	WriteProcessMemory(proc, pShellcode, rgShellcode, sizeof(rgShellcode), NULL);
	//CreateRemoteThread(proc, NULL, NULL, (LPTHREAD_START_ROUTINE)pShellcode, NULL, NULL, NULL);
}

The int case seems to just be overwriting instead of inserting, or it's pushing the stuff out of the vector leftwards (hard to explain). This might be an option but I haven't had time to look into it (not FIFO like vectors): http://www.cplusplus.com/reference/deque/deque/

Nate · Answer 25 · Sun Mar 04 2018 09:29:18 GMT+0800 (China Standard Time)

Isolated the issue: https://repl.it/repls/BrokenWorstSequences

The vector past 9 is no longer in contiguous memory(?)
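A note for anyone reading along: `std::vector` storage is guaranteed to be contiguous, so one thing worth checking in the snippet above is `sizeof(rgShellcode)`. Because `rgShellcode` is an `unsigned char*`, that expression is the size of a pointer, not the number of bytes in the vector. The sketch below is an assumption about how the stub could be assembled in a single forward pass, with the byte count taken from `size()`. `Stub` and `buildStub` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Stub {
    std::vector<std::uint8_t> bytes;
    std::size_t callOffset = 0; // index of the 0xE8 opcode
};

// Pushes (last argument first), call placeholder, cdecl cleanup, ret.
Stub buildStub(const std::vector<std::int32_t>& args) {
    assert(args.size() * 4 <= 127); // `add esp, imm8` takes a signed byte
    Stub stub;
    for (auto it = args.rbegin(); it != args.rend(); ++it) {
        std::uint32_t v = static_cast<std::uint32_t>(*it);
        stub.bytes.push_back(0x68); // push imm32
        for (int shift = 0; shift < 32; shift += 8) {
            stub.bytes.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
        }
    }
    stub.callOffset = stub.bytes.size();
    stub.bytes.push_back(0xE8); // call rel32, operand patched later
    for (int i = 0; i < 4; ++i) {
        stub.bytes.push_back(0x00);
    }
    stub.bytes.push_back(0x83); // add esp, imm8
    stub.bytes.push_back(0xC4);
    stub.bytes.push_back(static_cast<std::uint8_t>(args.size() * 4));
    stub.bytes.push_back(0xC3); // ret
    return stub;
}
```

For two arguments the stub is 19 bytes long with the call at offset 10. `bytes.size()` is the value to pass to VirtualAllocEx and WriteProcessMemory, and `callOffset + 1` is where the relative operand belongs.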

Rob · Answer 26 · Sun Mar 04 2018 09:53:49 GMT+0800 (China Standard Time)

Why not just build the entire opcode array dynamically instead of trying to insert the arguments after?

And I'm confused, when you're dynamically pushing the arguments what is all the bitshifting and bitwise AND for? Should it not require just 2 chars (push instruction and the value)?

Also when I try and write a value such as 69 (0x45), for example:

push 0x45
push 0x45
call 0xFF [add]

it will act as this: add(45, 45) instead of add(69, 69). Any idea as to what is happening here? I am unfamiliar with ASM.

void call(HANDLE pHandle, std::vector<Arg> args, TYPE returnType, DWORD64 address) {
  std::vector<unsigned char> argShellcode;

  for (auto &arg : args) {
    argShellcode.push_back(0x6A);

    if (arg.type == T_INT) {
      argShellcode.push_back(*static_cast<int*>(arg.value));
    }
  }

  std::vector<unsigned char> callShellcode = {
    0xE8, 0x00, 0x00, 0x00, 0x00, // call 0x00000000
    0x83, 0xC4, (unsigned char) (args.size() * 0x4), // add esp, [arg count * 4]
    0xC3, // return
    '\0', // null terminator to get the size of the shellcode
  };

  // concatenate the arg shellcode with the calling shellcode
  std::vector<unsigned char> shellcode;
  shellcode.reserve(argShellcode.size() + callShellcode.size());
  shellcode.insert(shellcode.end(), argShellcode.begin(), argShellcode.end());
  shellcode.insert(shellcode.end(), callShellcode.begin(), callShellcode.end());

  unsigned char* rgShellcode = shellcode.data();

  for (int i = 0; i < shellcode.size(); i++) {
    printf("0x%02x\n", rgShellcode[i]);
  }

  SIZE_T size = shellcode.size() * sizeof(char);

  void* pShellcode = VirtualAllocEx(pHandle, NULL, size, MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE); // allocate the shellcode in the target process
  *(uintptr_t*)(rgShellcode + argShellcode.size() + 1) = address - (uintptr_t)pShellcode - (argShellcode.size() + 5); // the call opcode follows the pushes, so its operand starts at argShellcode.size() + 1 and is relative to the end of the 5-byte call
  WriteProcessMemory(pHandle, pShellcode, rgShellcode, size, NULL); // actually write the shellcode to the new memory
  CreateRemoteThread(pHandle, NULL, NULL, (LPTHREAD_START_ROUTINE)pShellcode, NULL, NULL, NULL); // call our stub
}

int main()
{
  PROCESSENTRY32 process = findProcess("Test Project.exe");
  MODULEENTRY32 module = findModule("Test Project.exe", process.th32ProcessID);
	
  HANDLE pHandle = OpenProcess(PROCESS_ALL_ACCESS, FALSE, process.th32ProcessID);
  DWORD functionAddress = (DWORD)module.modBaseAddr + 0x1030;

  int a = 2;

  std::vector<Arg> args = {
    { T_INT, &a },
    { T_INT, &a },
  };

  call(pHandle, args, T_INT, functionAddress);
  return 0;
}
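On the question of whether two bytes per argument are enough: `push imm8` (0x6A) takes a single signed byte that the CPU sign-extends to 32 bits, so it only covers -128 to 127. Larger values need `push imm32` (0x68) followed by four little-endian bytes, which is what the shifting and masking produce. As for the 69 versus 45 puzzle, 0x45 and 69 are the same value written in hexadecimal and decimal. The sketch below picks an encoding by range. `encodePush` and `decodePush` are hypothetical helpers, and `decodePush` only mimics what the CPU would push.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Choose the shortest push encoding that preserves the value.
std::vector<std::uint8_t> encodePush(std::int32_t value) {
    std::vector<std::uint8_t> out;
    if (value >= -128 && value <= 127) {
        out.push_back(0x6A); // push imm8, sign-extended by the CPU
        out.push_back(static_cast<std::uint8_t>(value));
    } else {
        out.push_back(0x68); // push imm32
        std::uint32_t v = static_cast<std::uint32_t>(value);
        for (int shift = 0; shift < 32; shift += 8) {
            out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
        }
    }
    return out;
}

// The value the CPU would place on the stack for an encoded push.
std::int32_t decodePush(const std::vector<std::uint8_t>& bytes) {
    assert(!bytes.empty() && (bytes[0] == 0x6A || bytes[0] == 0x68));
    if (bytes[0] == 0x6A) {
        assert(bytes.size() == 2);
        return static_cast<std::int8_t>(bytes[1]); // sign extension
    }
    assert(bytes.size() == 5);
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        v |= static_cast<std::uint32_t>(bytes[1 + i]) << (8 * i);
    }
    return static_cast<std::int32_t>(v);
}
```

`encodePush(69)` gives `6A 45`, while `encodePush(200)` gives `68 C8 00 00 00`. Forcing 200 into the short form as `6A C8` would push -56 instead.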

Nate · Answer 27 · Sun Mar 04 2018 10:37:28 GMT+0800 (China Standard Time)

Do you have discord? This will be a lot easier than through github.

Nate · Answer 28 · Sun Mar 04 2018 10:41:55 GMT+0800 (China Standard Time)

chrome#3110, add me back

Rob · Answer 29 · Sat Mar 10 2018 09:24:05 GMT+0800 (China Standard Time)

I've created a repo with a working demonstration in case any one wants to further contribute. Still struggling with writing strings though, but everything else seems to be working so far.

SalameDefumado · Answer 30 · Mon May 14 2018 03:29:59 GMT+0800 (China Standard Time)

@Rob-- do you plan to push that to the lib?

Rob · Answer 31 · Sat May 26 2018 00:54:50 GMT+0800 (China Standard Time)

Yes, at some point after my exams. I will also look into adding other features.

Rodriguinho1 · Answer 32 · Sat Dec 08 2018 05:47:19 GMT+0800 (China Standard Time)

waiting :(

Liam Mitchell · Answer 33 · Sun Dec 16 2018 09:14:49 GMT+0800 (China Standard Time)

I think it's a pretty complicated thing if you're going to have different return types and inputs to the functions and handle all the calling conventions.
Running on a different thread is also not ideal, since you can run into race conditions.
There is also the question of having buffers large enough for out-params, and of passing arrays, lists or other structures.

Perhaps out of scope of memoryjs?

I'd suggest injecting a DLL to do more advanced things than reading/writing, but that's just my 2 cents.
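To illustrate the calling-convention point for 32-bit x86: with cdecl the caller removes the arguments from the stack after the call, while with stdcall the callee does, so a generated stub must only emit `add esp, n` in the cdecl case. The sketch below assumes 4-byte arguments and covers only these two conventions. `CallConv` and `emitEpilogue` are hypothetical names, not memoryjs API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class CallConv { Cdecl, Stdcall };

// Bytes that follow the call instruction in a generated stub.
std::vector<std::uint8_t> emitEpilogue(CallConv conv, std::size_t argCount) {
    std::vector<std::uint8_t> out;
    if (conv == CallConv::Cdecl && argCount > 0) {
        assert(argCount * 4 <= 127); // `add esp, imm8` takes a signed byte
        out.push_back(0x83);         // add esp, imm8: caller cleans the stack
        out.push_back(0xC4);
        out.push_back(static_cast<std::uint8_t>(argCount * 4));
    }
    out.push_back(0xC3);             // ret
    return out;
}
```

For two arguments, cdecl yields `83 C4 08 C3` and stdcall yields just `C3`. Using the cdecl cleanup on a stdcall target would adjust the stack twice, which is one way a stub can crash the target after the call itself succeeded.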

Rob · Answer 34 · Mon Dec 17 2018 05:35:31 GMT+0800 (China Standard Time)

I've nearly implemented this into the library, it can remotely call the function perfectly, but for some reason the target application crashes (complains about illegal instructions) even though the shell code generated by my dummy project is identical to the shellcode generated by the library.

Rob · Answer 35 · Tue Dec 18 2018 02:28:55 GMT+0800 (China Standard Time)

Just pushed function execution in 5538702, documentation has also been updated. To test this, you can run this C++ program, copy the address of the function that is printed in the console and then update this JS file and run it.

You can play around with the testAdd function in the C++ program. Just ensure the return type and the arguments of testAdd (or whatever function you use) match the return type and arguments supplied by the JS file.

Nate · Answer 36 · Thu Jan 31 2019 10:11:43 GMT+0800 (China Standard Time)

Just saw this. Great job, this is impressive.

Rodriguinho1 · Answer 37 · Sat Mar 09 2019 06:58:32 GMT+0800 (China Standard Time)

When will it be available in npm?

Rob · Answer 38 · Sun Mar 31 2019 08:01:45 GMT+0800 (China Standard Time)

@Rodriguinho1 available now.

Omar Minaya · Answer 39 · Wed Feb 10 2021 15:32:59 GMT+0800 (China Standard Time)

could we get some more documentation on this? i haven’t had any luck executing functions or calling them at all