Rob-- / memoryjs

Read and write process memory in Node.js (Windows API functions exposed via Node bindings)

Function calling?

xetrics opened this issue · comments

commented

You should consider adding functionality for calling functions by address. The equivalent in C would be

int ( __cdecl *addValues )( int a, int b ) = ( int ( __cdecl* )( int, int ) )0xDEADBEEF;
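
For anyone unfamiliar with the pattern, here is a minimal in-process sketch of calling a function through its address. The names `AddValuesFn` and `callAddValuesAt` are made up for illustration, the local `addValues` is only a stand-in for the real target, and the `__cdecl` annotation is left out so the snippet compiles outside MSVC.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a function that already lives in the process.
int addValues(int a, int b) { return a + b; }

// The signature we assume the code at the address has.
using AddValuesFn = int (*)(int, int);

// Treat `address` as a function with the AddValuesFn signature and call it.
// Nothing checks that the address really holds such a function.
int callAddValuesAt(std::uintptr_t address, int a, int b) {
    assert(address != 0 && "address must point at a function");
    auto fn = reinterpret_cast<AddValuesFn>(address);
    return fn(a, b);
}
```

Calling `callAddValuesAt(reinterpret_cast<std::uintptr_t>(&addValues), 4, 5)` goes through the pointer exactly like a direct `addValues(4, 5)` call. This only works from inside the process that owns the function, for example from an injected library.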

commented

I'm not a C++ programmer so I'll have to read up about this, if you could help at all that would be great. The aim of this implementation would be something similar to:

var args = [{
  type: memoryjs.INT,
  value: 5
}, {
  type: memoryjs.INT,
  value: 10
}]
var returnType = memoryjs.INT
var offset = 0xDEADBEEF // offset of `addValues`

memoryjs.executeFunction(args, returnType, offset, (err, result) => {
  console.log(result) // 15
})

Something dynamic like this would be a lot better than hard coding a huge array of different function combinations. Do you know if something like this is possible?

Also if *addValues is the function pointer, would you simply call it by doing addValues(4, 5)?

commented

Yes, here is an article I feel could help: https://www.unknowncheats.me/wiki/Calling_Functions_From_Injected_Library_Using_Function_Pointers_in_C%2B%2B

I am a C++ programmer but not very experienced with node-gyp. I could help with some of the C if necessary.

commented

I can handle the JS implementation. Looking at the example code I posted above, do you know how to recreate your code snippet without hard coding the parameters of the function? How can you recreate the function pointer with an unknown number of parameters, each with a given data type?

commented

varargs and templates (looking for a more elaborate example)
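
A rough sketch of what the template route could look like, assuming we are inside the target process. `callAt` is a hypothetical helper and the two local functions are only test targets.

```cpp
#include <cassert>
#include <cstdint>

// Build the function pointer type from the return type and the argument
// types of the call itself, then invoke whatever lives at `address`.
template <typename Ret, typename... Args>
Ret callAt(std::uintptr_t address, Args... args) {
    assert(address != 0 && "address must point at a function");
    using Fn = Ret (*)(Args...);
    return reinterpret_cast<Fn>(address)(args...);
}

// Hypothetical targets, used only to show the helper in action.
int addValues(int a, int b) { return a + b; }
double scale(double x, int factor) { return x * factor; }
```

`callAt<int>(address, 5, 10)` deduces `int (*)(int, int)` from the arguments. The catch is that the types are still fixed when the C++ is compiled, so on its own this does not cover argument lists that are only described at runtime (for example an array of `{ type, value }` objects coming from JS).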

commented

OH, I completely forgot that we are not working with an injected library here (where I usually work from)

We need to do something along the lines of https://msdn.microsoft.com/en-us/library/windows/desktop/ms682437(v=vs.85).aspx

What would need to happen is a stub would need to be code caved into the process that accepts a pointer to multiple arguments, as CRT only supports passing one argument to the thread. Not quite sure if this is a one man job. If you create a PR I would be sure to help.

This may seem like tons of work, but I do believe it will take your module to the next level with its capabilities.

I will write a stub in asm that you can WPM when I get home tomorrow.

commented

Here is semi pseudo code for the C++ code cave implementation:

typedef int(__stdcall *__addValues)(INT, INT);
class CaveParams {
public:
  int a = 3, b = 5; // function parameters
  DWORD addValues; // function address in memory
  // how do we build this class dynamically?
};

DWORD __stdcall RemoteThread(CaveParams *cp){
  __addValues addFunc = (__addValues) cp -> addValues;
  addFunc(cp -> a, cp -> b);
  // type definition for the addValues function is still required for it to be called?
  return 0;
}

int main(){
  // hProcess is a handle to the target process, opened elsewhere
  // copy the RemoteThread routine into the target process (sizeof(CaveParams) is only a placeholder for its real size)
  LPVOID eRemoteThread = VirtualAllocEx(hProcess, NULL, sizeof(CaveParams), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
  WriteProcessMemory(hProcess, eRemoteThread, (LPCVOID) RemoteThread, sizeof(CaveParams), NULL);
  // copy the parameters into the target process
  CaveParams* params = (CaveParams*) VirtualAllocEx(hProcess, NULL, sizeof(CaveParams), MEM_COMMIT, PAGE_READWRITE);
  CaveParams caveParams;
  WriteProcessMemory(hProcess, params, &caveParams, sizeof(CaveParams), NULL);
  // run the copied routine with the copied parameters as its single argument
  HANDLE hRemoteThread = CreateRemoteThread(hProcess, 0, 0, (LPTHREAD_START_ROUTINE) eRemoteThread, params, 0, 0);
}

Just looked a bit into code caving and rewrote some sample code to point out 2 issues I still don't understand:

  1. Inside the RemoteThread function that's being run inside the process, I still don't see how it's possible to call the function without knowing beforehand the exact number of parameters. Even if this is possible, the CaveParams class was hard coded beforehand to know the number of parameters and their data type?

  2. The function addValues still needs to be defined before it is called inside of the RemoteThread. Maybe I'm still missing something?


From what I understand so far:

  • Use CreateRemoteThread to run a function we want inside of the process
  • CreateRemoteThread only accepts one parameter, so create our own function to run inside of the process that accepts a pointer to parameters
  • Inside of the function we create, call the process's internal function (get function arguments and address from the RemoteThread pointer parameter)
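
The three points above can be sketched without any Windows calls. This is only the "one pointer in" pattern run inside a single process: the real thing would still need the routine and the parameter block copied into the target and started with CreateRemoteThread. The struct layout and the `result` field are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Everything the routine needs, packed behind the single pointer that a
// thread start routine receives.
struct CaveParams {
    std::uintptr_t function; // address of the function to call
    int a;                   // first argument
    int b;                   // second argument
    int result;              // filled in by the routine (hypothetical field)
};

// Same shape as a thread start routine: one pointer in, one status code out.
unsigned long remoteThread(void* raw) {
    auto* params = static_cast<CaveParams*>(raw);
    assert(params != nullptr && params->function != 0);
    using AddFn = int (*)(int, int); // the signature is still hard coded here
    params->result = reinterpret_cast<AddFn>(params->function)(params->a, params->b);
    return 0;
}

// Hypothetical stand-in for the target's internal function.
int addValues(int a, int b) { return a + b; }
```

Filling a `CaveParams` with the address of `addValues`, 3 and 5, then passing its address to `remoteThread`, leaves 8 in `result`. Note that the sketch runs into the same limitation as the two questions above: the routine only works for one fixed signature.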
commented

We cannot have a predefined stub function - one must be generated due to parameters. Here is an example.

Say we wanted to call addValues(1, 5). Here is what the stub would look like:

push 5
push 1
call 0xFF

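That would then need to be converted to the opcode for each instruction, so we can WPM it (0x6A is push with a 1 byte value, 0xE8 is call followed by a 4 byte relative offset, left as zeroes here)
\x6A\x05\x6A\x01\xE8\x00\x00\x00\x00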

Then we can CRT that function.
(If the function only has a single parameter we can just call CRT with that 1 parameter)

Also, if we were to have to push strings we would need to VirtualAllocEx them and then push the pointer
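
To make the instruction-to-bytes step concrete, here is a small sketch that emits the bytes for such a stub, assuming the standard 32-bit x86 encodings: 0x6A for push with a 1 byte value, 0x68 for push with a 4 byte value, and 0xE8 for call with a 4 byte relative displacement. The helper names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append a 32-bit value least significant byte first (x86 is little endian).
void emitLE32(Bytes& out, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}

// push imm8: opcode 0x6A plus one sign-extended byte (-128..127 only).
void emitPush8(Bytes& out, int value) {
    assert(value >= -128 && value <= 127);
    out.push_back(0x6A);
    out.push_back(static_cast<std::uint8_t>(value & 0xFF));
}

// push imm32: opcode 0x68 plus four bytes.
void emitPush32(Bytes& out, std::uint32_t value) {
    out.push_back(0x68);
    emitLE32(out, value);
}

// call rel32: opcode 0xE8 plus a displacement measured from the next instruction.
void emitCallRel32(Bytes& out, std::uint32_t displacement) {
    out.push_back(0xE8);
    emitLE32(out, displacement);
}

// Bytes for addValues(1, 5): the last argument is pushed first.
Bytes encodeAddValuesStub(std::uint32_t displacement) {
    Bytes code;
    emitPush8(code, 5);
    emitPush8(code, 1);
    emitCallRel32(code, displacement);
    return code;
}
```

With a zero displacement this gives `6A 05 6A 01 E8 00 00 00 00`. The displacement can only be filled in once the address of the allocated stub is known, and a stub that is started as a thread also needs a `ret` at the end.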

commented

Had a quick look and it seems the easiest way to convert from ASM to opcode is to build up the ASM in an array (loop over each parameter and add the push instruction, then add the call instruction last) and then loop over this array and build up a new bytes array that can be written to memory. Sounds easy but it's probably a lot harder than I'm thinking

commented

Here is a functioning library written by my friend, he uses it in some exploits of his.
https://gist.github.com/3dsboy08/d9ae62a4ef677df870647ba8297dcfd1

commented

Can't get any code cave example to work. I've tried your friend's plus other sample code online with the calculator example (getting a message box to display on calc.exe) but it causes the calculator to crash whenever the remote thread is created

commented

Is this still being implemented? I need this functionality for my project.

commented

I've been busy with other projects and currently have no plan to look into this again yet, sorry! Pull requests are welcome if you want to try implementing it yourself.

commented

Also need this functionality for a project, if you find some time could you take another look at it?

@Rob-- when u have some time please watch this video if it helps you
https://www.youtube.com/watch?v=0NwlWaT9NEY

commented

@SalameDefumado Watched the video but I can't think of a way to implement this.

We need to define a structure that contains the parameters and address of the function. This can't be done dynamically due to the way C++ is compiled. The solution to this is that users will need to edit the CPP file and add their own structures manually for each function they want to call. They will also need to create and define their own routines that will be injected into the target process.

What I can implement is the allocation of memory and CRT. However it will not be possible to implement something that allows the user to call a function completely from JS space without editing the CPP files and recompiling.

commented

I know this is old but my solution is still possible. Building a universal codecave dynamically based on the parameters and calling it via CRT. Should work fine with integers but some issues arise when you need to push strings.

commented

@xetrics so it's possible to directly WPM instruction bytes?

Okay so this means that the CaveData struct can be passed from JS to C++ and the lib will iterate over the structure to produce the byte array of opcode/operand?

For the ASM, how would we call the function that we have the address of? Previously you stated call 0xFF, can you expand on this and explain please?

Edit: neither the video's example nor your friend's example works for me, in both cases the target application crashes. ZeroMemory's example also doesn't make sense, he is assuming the size of the remote thread is the size of CaveData, so I tried using your friend's GetFunctionSize function and I get an access violation reading memory error at if (*(BYTE*)(Func + Size) == 0xF4).

commented

@Rob--

Accurately detecting the size of functions is a pretty messy area of reverse engineering: compilers are not all the same, and some programs even add random padding in between stubs to make this harder. The function just loops over the entire function and adds to the size until it sees a ret instruction (0xF4). This can get sloppy at times, but it is the best solution when you are reverse engineering.

To answer your access violation, try using VirtualQuery to make sure that the page is readable (for example PAGE_READONLY or PAGE_EXECUTE_READ).

To call the function using call, you would need to set the operand as ptr ds:[address of function]

Here is a useful site http://ref.x86asm.net/coder32.html

NOTE: 90% of the stuff we are doing only works in 32-bit.

commented

I've tried a few different ways to get the size of the function:

  • Using your friend's function
  • RPM a byte at a time and comparing it to 0xF4
  • Adding a fake function after the definition of the routine to get the difference between the base addresses

I created an empty console project and used that as the target process. The target process and the injecting process were both 32 bit. In all cases the target process crashes, it would be incredibly helpful if you could provide a complete working example project.

I'm not too experienced with ASM but according to your link 0xF4 is HLT whereas 0xC2 and 0xC3 are RET. I've tried all the stated methods with 0xF4 and 0xC3 but nothing is working for me so far.
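
For reference, this is roughly what the byte scan amounts to when it looks for 0xC3 (the plain near RET). It is a heuristic sketch working on a local byte buffer, not a dependable way to size a function, and `guessFunctionSize` is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Count bytes up to and including the first 0xC3, scanning at most `limit`
// bytes. Returns 0 when no 0xC3 is found. This does not decode
// instructions, so a 0xC3 inside an operand ends the scan too early.
std::size_t guessFunctionSize(const std::uint8_t* code, std::size_t limit) {
    assert(code != nullptr);
    for (std::size_t i = 0; i < limit; ++i) {
        if (code[i] == 0xC3) {
            return i + 1;
        }
    }
    return 0;
}
```

On `6A 01 6A 01 E8 00 00 00 00 83 C4 08 C3` it reports 13 bytes. On `68 C3 00 00 00 C3` (a push of the value 0xC3 followed by ret) it stops after 2 bytes, which is the kind of sloppiness mentioned earlier. A function that ends in 0xC2 (RET with a 2 byte operand) would be missed entirely.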

commented

Sorry I looked at your reply in school and forgot to make an example when I got home, I'll try and reply tonight.

commented

Sure thing my dude. When you make the example can you please tell me what OS and what OS version you ran it on?

commented

Currently working on a way to do them dynamically, it's kind of a headache but here is how you call functions externally with static stuff (sorry for messy code, wrote this up in like 15 minutes):

	PROCESSENTRY32 pe = { sizeof(PROCESSENTRY32) };
	auto snapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
	int pid = 0;
	if (Process32First(snapshot, &pe)) {
		do { // do/while so the first entry in the snapshot is checked too
			if (!strcmp(pe.szExeFile, "target.exe")) {
				pid = pe.th32ProcessID;
			}
		} while (Process32Next(snapshot, &pe));
	}
	auto proc = OpenProcess(PROCESS_ALL_ACCESS, FALSE, pid);
	/*
	below shellcode in asm:
	push 1
	push 1
	call 0x000000    ;   placeholder for the relative address to be calculated
	add esp, 8       ;   amt of args * 4
	ret
	*/

	BYTE rgShellcode[] = {0x6A, 0x01, 0x6A, 0x01, 0xE8, 0x0, 0x0, 0x0, 0x0, 0x83, 0xC4, 0x08, 0xC3};
	void* pShellcode = VirtualAllocEx(proc, NULL, sizeof(rgShellcode), MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE); // allocate the shellcode in the target process
	*(uintptr_t*)(rgShellcode + 5) = (uintptr_t)ADDRESS - (uintptr_t)pShellcode - 9; // using the position of the newly allocated memory and target function, calculate the call address relative to the new memory
	WriteProcessMemory(proc, pShellcode, rgShellcode, sizeof(rgShellcode), NULL); // actually write the shellcode to the new memory
	CreateRemoteThread(proc, NULL, NULL, (LPTHREAD_START_ROUTINE)pShellcode, NULL, NULL, NULL); // call our stub

Be sure to disable ASLR in your target process. (we will need to compensate for this eventually)
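
A note on where the `- 9` comes from: the operand of 0xE8 is measured from the instruction after the call. The call opcode sits at offset 4 of the stub and is 5 bytes long, so the next instruction is at offset 9. A small sketch of that arithmetic on a local copy of the bytes, with made-up helper names:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Displacement for a call whose 0xE8 opcode sits at stubBase + callOffset.
// Unsigned arithmetic wraps modulo 2^32, which also covers backward calls.
std::uint32_t callDisplacement(std::uint32_t target, std::uint32_t stubBase,
                               std::uint32_t callOffset) {
    const std::uint32_t nextInstruction = stubBase + callOffset + 5; // opcode + 4 operand bytes
    return target - nextInstruction;
}

// Write the displacement into the four bytes that follow the 0xE8 opcode.
void patchCall(std::vector<std::uint8_t>& stub, std::size_t callOffset,
               std::uint32_t target, std::uint32_t stubBase) {
    assert(callOffset + 5 <= stub.size() && stub[callOffset] == 0xE8);
    const std::uint32_t rel =
        callDisplacement(target, stubBase, static_cast<std::uint32_t>(callOffset));
    for (int i = 0; i < 4; ++i) {
        stub[callOffset + 1 + i] = static_cast<std::uint8_t>((rel >> (8 * i)) & 0xFF);
    }
}
```

For example, with the stub allocated at 0x00250000 and the target function at 0x00401030, the operand becomes 0x001B1027, stored as `27 10 1B 00`. Both addresses are invented for the example.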

commented

Okay yeah this code works for me, how do we go about pushing anything other than an int to the stack?

commented

It all comes down to knowledge of datatypes in memory. For example, for strings you would need to alloc/WPM it, then push the address absolute/relative (not sure on this one). For bools, you can just push a 0x01 or a 0x00. I'm having issues with a weird paradox right now on the dynamic building of the push instructions. If you want to take a look, here it is (ignore all the ugly code and debug stuff):

enum ARGTYPE {T_INT, T_NONE};
struct arg {
	ARGTYPE type;
	void* value;
};

void fnCallRoutine(HANDLE proc, std::vector<arg> pargs, ARGTYPE preturn, void* address) {
	unsigned char nparams_c = (unsigned char)pargs.size();
	std::vector<unsigned char> array_queue = { 0xE8, 0x0, 0x0, 0x0, 0x0, 0x83, 0xC4, nparams_c, 0xC3 };

	for (auto i = 0; i < pargs.size(); i++) {
		
		switch (pargs[i].type) {
		case T_INT:
			int param = (int)pargs[i].value;
			array_queue.insert(array_queue.begin(), (param >> 24) & 0xFF);
			array_queue.insert(array_queue.begin(), (param >> 16) & 0xFF);
			array_queue.insert(array_queue.begin(), (param >> 8) & 0xFF);
			array_queue.insert(array_queue.begin(), param & 0xFF);
			array_queue.insert(array_queue.begin(), 0x68); 
			break;
		}
	}

	unsigned char* rgShellcode = array_queue.data();
	printf("0x%02x, 0x%02x, 0x%02x, 0x%02x, 0x%02x, 0x%02x, 0x%02x, 0x%02x, 0x%02x", rgShellcode[0], rgShellcode[1], rgShellcode[2], rgShellcode[3], rgShellcode[4], rgShellcode[5], rgShellcode[6], rgShellcode[7], rgShellcode[8]);
	getchar();
	void* pShellcode = VirtualAllocEx(proc, NULL, sizeof(rgShellcode), MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE);
	*(uintptr_t*)(rgShellcode + 5) = (uintptr_t)address - (uintptr_t)rgShellcode - 9;
	WriteProcessMemory(proc, pShellcode, rgShellcode, sizeof(rgShellcode), NULL);
	//CreateRemoteThread(proc, NULL, NULL, (LPTHREAD_START_ROUTINE)pShellcode, NULL, NULL, NULL);
}

The int case seems to just be overwriting instead of inserting, or it's pushing the stuff out of the vector leftwards (hard to explain). This might be an option but I haven't had time to look into it (not FIFO like vectors): http://www.cplusplus.com/reference/deque/deque/
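
On the string case: a pointer argument is pushed as an absolute address, since the callee dereferences it directly. One possible layout, sketched under the assumption of a 32-bit cdecl function taking a single `const char*`, keeps the string in the same allocation right behind the code (a variation on allocating it separately). `buildStringCallStub` is a hypothetical helper and the bytes are only built locally here.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append a 32-bit value least significant byte first.
void emitLE32(Bytes& out, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}

// Layout: push <address of text>; call target; add esp, 4; ret; text bytes; 0.
// stubBase is the address the buffer will have inside the target process.
Bytes buildStringCallStub(std::uint32_t stubBase, std::uint32_t target,
                          const std::string& text) {
    const std::uint32_t codeSize = 5 + 5 + 3 + 1; // push, call, add esp, ret
    Bytes stub;
    stub.push_back(0x68);                      // push imm32
    emitLE32(stub, stubBase + codeSize);       // absolute address of the string
    stub.push_back(0xE8);                      // call rel32
    emitLE32(stub, target - (stubBase + 10));  // relative to the byte after the call
    stub.push_back(0x83);                      // add esp, 4 (one argument)
    stub.push_back(0xC4);
    stub.push_back(0x04);
    stub.push_back(0xC3);                      // ret
    assert(stub.size() == codeSize);
    stub.insert(stub.end(), text.begin(), text.end());
    stub.push_back(0x00);                      // null terminator
    return stub;
}
```

With an invented base of 0x1000, target of 0x2000 and the text "hi", the string lands at 0x100E and the stub is 17 bytes long. The whole buffer would then be written in one go into memory that is both executable and readable.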

commented

Isolated the issue: https://repl.it/repls/BrokenWorstSequences

The vector past 9 is no longer in contiguous memory(?)
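
For what it's worth, a `std::vector` always stores its elements contiguously, so two other things may be worth checking in that snippet: `sizeof(rgShellcode)` is the size of a pointer rather than of the shellcode, and the `printf` only shows the first nine bytes. Here is a hedged sketch of building the bytes front to back instead, assuming a 32-bit cdecl target that takes only ints. `buildCdeclStub` is a made-up name and nothing is written to a process.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append a 32-bit value least significant byte first.
void emitLE32(Bytes& out, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<std::uint8_t>((value >> shift) & 0xFF));
    }
}

// push each argument (last one first), call, clean up the stack, return.
Bytes buildCdeclStub(const std::vector<std::int32_t>& args,
                     std::uint32_t target, std::uint32_t stubBase) {
    assert(args.size() <= 31); // add esp, imm8 holds at most 127, so 31 * 4 bytes
    Bytes code;
    for (auto it = args.rbegin(); it != args.rend(); ++it) {
        code.push_back(0x68); // push imm32
        emitLE32(code, static_cast<std::uint32_t>(*it));
    }
    // The call position depends on how many pushes came before it.
    const std::uint32_t callOffset = static_cast<std::uint32_t>(code.size());
    code.push_back(0xE8); // call rel32
    emitLE32(code, target - (stubBase + callOffset + 5));
    code.push_back(0x83); // add esp, <argument count * 4>
    code.push_back(0xC4);
    code.push_back(static_cast<std::uint8_t>(args.size() * 4));
    code.push_back(0xC3); // ret
    return code;
}
```

The number of bytes to allocate and write is then `code.size()`, and the call operand no longer assumes the call sits at a fixed offset. For `{1, 5}` with invented addresses 0x2000 (target) and 0x1000 (stub) this produces `68 05 00 00 00 68 01 00 00 00 E8 F1 0F 00 00 83 C4 08 C3`.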

commented

Why not just build the entire opcode array dynamically instead of trying to insert the arguments after?

And I'm confused, when you're dynamically pushing the arguments what is all the bitshifting and bitwise AND for? Should it not require just 2 chars (push instruction and the value)?

Also when I try and write a value such as 69 (0x45), for example:

push 0x45
push 0x45
call 0xFF [add]

it will act as this: add(45, 45) instead of add(69, 69). Any idea as to what is happening here? I am unfamiliar with ASM.
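
One thing that might explain this, assuming the values are being looked at in hex somewhere along the way: 0x45 and 69 are the same number, only written differently, so a hex printout shows 45. A tiny sketch (`formatInt` is just an illustrative helper):

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Format the same integer either as hex ("%x") or as decimal ("%d").
std::string formatInt(int value, bool hex) {
    char buffer[16];
    const int written = hex
        ? std::snprintf(buffer, sizeof(buffer), "%x", static_cast<unsigned>(value))
        : std::snprintf(buffer, sizeof(buffer), "%d", value);
    assert(written > 0 && written < static_cast<int>(sizeof(buffer)));
    return std::string(buffer);
}
```

`formatInt(0x45, true)` gives "45" while `formatInt(0x45, false)` gives "69". If the target really receives 45 decimal, then the byte written after 0x6A was 0x2D rather than 0x45, which would point at the code that fills in the value.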

void call(HANDLE pHandle, std::vector<Arg> args, TYPE returnType, DWORD64 address) {
  std::vector<unsigned char> argShellcode;

  for (auto &arg : args) {
    argShellcode.push_back(0x6A);

    if (arg.type == T_INT) {
      argShellcode.push_back(*static_cast<int*>(arg.value));
    }
  }

  std::vector<unsigned char> callShellcode = {
    0xE8, 0x00, 0x00, 0x00, 0x00, // call 0x00000000
    0x83, 0xC4, (unsigned char) (args.size() * 0x4), // add esp, [arg count * 4]
    0xC3, // return
    '\0', // null terminator to get the size of the shellcode
  };

  // concatenate the arg shellcode with the calling shellcode
  std::vector<unsigned char> shellcode;
  shellcode.reserve(argShellcode.size() + callShellcode.size());
  shellcode.insert(shellcode.end(), argShellcode.begin(), argShellcode.end());
  shellcode.insert(shellcode.end(), callShellcode.begin(), callShellcode.end());

  unsigned char* rgShellcode = shellcode.data();

  for (int i = 0; i < shellcode.size(); i++) {
    printf("0x%02x\n", rgShellcode[i]);
  }

  SIZE_T size = shellcode.size() * sizeof(char);

  void* pShellcode = VirtualAllocEx(pHandle, NULL, size, MEM_RESERVE | MEM_COMMIT, PAGE_EXECUTE_READWRITE); // allocate the shellcode in the target process
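  const size_t callIndex = argShellcode.size(); // index of the 0xE8 opcode, which comes right after the pushes
  *(DWORD*)(rgShellcode + callIndex + 1) = (DWORD)(address - (uintptr_t)pShellcode - (callIndex + 5)); // using the position of the newly allocated memory and target function, calculate the call address relative to the instruction after the call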
  WriteProcessMemory(pHandle, pShellcode, rgShellcode, size, NULL); // actually write the shellcode to the new memory
  CreateRemoteThread(pHandle, NULL, NULL, (LPTHREAD_START_ROUTINE)pShellcode, NULL, NULL, NULL); // call our stub
}

int main()
{
  PROCESSENTRY32 process = findProcess("Test Project.exe");
  MODULEENTRY32 module = findModule("Test Project.exe", process.th32ProcessID);
	
  HANDLE pHandle = OpenProcess(PROCESS_ALL_ACCESS, FALSE, process.th32ProcessID);
  DWORD functionAddress = (DWORD)module.modBaseAddr + 0x1030;

  int a = 2;

  std::vector<Arg> args = {
    { T_INT, &a },
    { T_INT, &a },
  };

  call(pHandle, args, T_INT, functionAddress);
  return 0;
}
commented

Do you have discord? This will be a lot easier than through github.

commented

Yep.

commented

chrome#3110, add me back

commented

I've created a repo with a working demonstration in case anyone wants to further contribute. Still struggling with writing strings though, but everything else seems to be working so far.

@Rob-- do you plan to push that to the lib?

commented

Yes, at some point after my exams. I will also look into adding other features.

waiting :(

I think it's a pretty complicated thing, if you're going to have different return types and inputs to the functions and handle all the calling conventions.
And running on a different thread is not ideal; you can run into race conditions.
As well as having buffers large enough for outparams, passing arrays, lists or other structures?

Perhaps out of scope of memoryjs?

I'd suggest injecting a DLL to do more advanced things than reading/writing, but that's just my 2 cents.

commented

I've nearly implemented this into the library, it can remotely call the function perfectly, but for some reason the target application crashes (complains about illegal instructions) even though the shell code generated by my dummy project is identical to the shellcode generated by the library.

commented

Just pushed function execution in 5538702, documentation has also been updated. To test this, you can run this C++ program, copy the address of the function that is printed in the console and then update this JS file and run it.

You can play around with the testAdd function in the C++ program. Just ensure the return type and the arguments of testAdd (or whatever function you use) match the return type and arguments supplied by the JS file.

commented

Just saw this. Great job, this is impressive.

When will it be available in npm?

commented

@Rodriguinho1 available now.

Could we get some more documentation on this? I haven’t had any luck executing functions or calling them at all.