diff mbox

[RFC] kexec_file: Add support for purgatory built as PIE

Message ID 3476672.cGrsQmUQKg@morokweng (mailing list archive)
State RFC
Headers show

Commit Message

Thiago Jung Bauermann Nov. 2, 2016, 6 a.m. UTC
Hello,

The kexec_file code currently builds the purgatory as a partially linked object 
(using ld -r). Is there a particular reason to use that instead of a position 
independent executable (PIE)?

I found a discussion from 2013 in the archives but from what I understood it 
was about the purgatory as a separate object vs having it linked into the 
kernel, which is different from what I'm asking:

http://lists.infradead.org/pipermail/kexec/2013-December/010535.html

Here is my motivation for this question:

 On ppc64 purgatory.ro has 12 relocation types when built as a partially 
linked object. This makes arch_kexec_apply_relocations_add duplicate a lot of 
code with module_64.c:apply_relocate_add to implement these relocations. The 
alternative is to do some refactoring so that both functions can share the 
implementation of the relocations. This is done in patches 5 and 6 of the 
kexec_file_load implementation for powerpc:

https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-October/149984.html

Michael Ellerman would prefer if module_64.c didn't need to be changed, and 
suggested that the purgatory could be a position independent executable. 
Indeed, in that case there are only 4 relocation types in purgatory.ro (which 
aren't even implemented in module_64.c:apply_relocate_add), so the relocation 
code for the purgatory can leave that file alone and have its own relocation 
implementation.

Also, the purgatory is an executable and not an intermediary output from the 
compiler, so in my mind it makes sense conceptually that it is easier to build 
it as a PIE than as a partially linked object.

The patch below adds the support needed in kexec_file.c to allow powerpc-
specific code to load and relocate a purgatory binary built as PIE. This is WIP 
and can probably be refined a bit. Would you accept a change along these lines?

Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
---
 arch/Kconfig            |   3 +
 kernel/kexec_file.c     | 159 ++++++++++++++++++++++++++++++++++++++++++++++--
 kernel/kexec_internal.h |  26 ++++++++
 3 files changed, 183 insertions(+), 5 deletions(-)

Comments

Baoquan He Nov. 4, 2016, 7:38 a.m. UTC | #1
On 11/02/16 at 04:00am, Thiago Jung Bauermann wrote:
> Hello,
> 
> The kexec_file code currently builds the purgatory as a partially linked object 
> (using ld -r). Is there a particular reason to use that instead of a position 
> independent executable (PIE)?

It's taken as "-r", relocatable in user space kexec-tools too originally.
I think Vivek just keeps it the same when moving into kernel.

> 
> I found a discussion from 2013 in the archives but from what I understood it 
> was about the purgatory as a separate object vs having it linked into the 
> kernel, which is different from what I'm asking:
> 
> http://lists.infradead.org/pipermail/kexec/2013-December/010535.html
> 
> Here is my motivation for this question:
> 
>  On ppc64 purgatory.ro has 12 relocation types when built as a partially 
> linked object. This makes arch_kexec_apply_relocations_add duplicate a lot of 
> code with module_64.c:apply_relocate_add to implement these relocations. The 
> alternative is to do some refactoring so that both functions can share the 
> implementation of the relocations. This is done in patches 5 and 6 of the 
> kexec_file_load implementation for powerpc:

In user space kexec-tools utility, you also got this problem?

> 
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-October/149984.html
> 
> Michael Ellerman would prefer if module_64.c didn't need to be changed, and 
> suggested that the purgatory could be a position independent executable. 
> Indeed, in that case there are only 4 relocation types in purgatory.ro (which 
> aren't even implemented in module_64.c:apply_relocate_add), so the relocation 
> code for the purgatory can leave that file alone and have its own relocation 
> implementation.
> 
> Also, the purgatory is an executable and not an intermediary output from the 
> compiler, so in my mind it makes sense conceptually that it is easier to build 
> it as a PIE than as a partially linked object.
> 
> The patch below adds the support needed in kexec_file.c to allow powerpc-
> specific code to load and relocate a purgatory binary built as PIE. This is WIP 
> and can probably be refined a bit. Would you accept a change along these lines?
> 
> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.vnet.ibm.com>
> ---
>  arch/Kconfig            |   3 +
>  kernel/kexec_file.c     | 159 ++++++++++++++++++++++++++++++++++++++++++++++--
>  kernel/kexec_internal.h |  26 ++++++++
>  3 files changed, 183 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 659bdd079277..7fd6879be222 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -5,6 +5,9 @@
>  config KEXEC_CORE
>  	bool
>  
> +config HAVE_KEXEC_FILE_PIE_PURGATORY
> +	bool
> +
>  config OPROFILE
>  	tristate "OProfile system profiling"
>  	depends on PROFILING
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 0c2df7f73792..dfc3e015160d 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -633,7 +633,149 @@ static int kexec_calculate_store_digests(struct kimage *image)
>  	return ret;
>  }
>  
> -/* Actually load purgatory. Lot of code taken from kexec-tools */
> +#ifdef CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY
> +/* Load PIE purgatory using the program header information. */
> +static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
> +				  unsigned long max, int top_down)
> +{
> +	struct purgatory_info *pi = &image->purgatory_info;
> +	unsigned long first_offset;
> +	unsigned long orig_load_addr = 0;
> +	const void *src;
> +	int i, ret;
> +	const Elf_Phdr *phdrs = (const void *) pi->ehdr + pi->ehdr->e_phoff;
> +	const Elf_Phdr *phdr;
> +	const Elf_Shdr *sechdrs_c;
> +	Elf_Shdr *sechdr;
> +	Elf_Shdr *sechdrs = NULL;
> +	struct kexec_buf kbuf = { .image = image, .bufsz = 0, .buf_align = 1,
> +				  .buf_min = min, .buf_max = max,
> +				  .top_down = top_down };
> +
> +	/*
> +	 * sechdrs_c points to section headers in purgatory and are read
> +	 * only. No modifications allowed.
> +	 */
> +	sechdrs_c = (void *) pi->ehdr + pi->ehdr->e_shoff;
> +
> +	/*
> +	 * We can not modify sechdrs_c[] and its fields. It is read only.
> +	 * Copy it over to a local copy where one can store some temporary
> +	 * data and free it at the end. We need to modify ->sh_addr and
> +	 * ->sh_offset fields to keep track of permanent and temporary
> +	 * locations of sections.
> +	 */
> +	sechdrs = vzalloc(pi->ehdr->e_shnum * sizeof(Elf_Shdr));
> +	if (!sechdrs)
> +		return -ENOMEM;
> +
> +	memcpy(sechdrs, sechdrs_c, pi->ehdr->e_shnum * sizeof(Elf_Shdr));
> +
> +	/*
> +	 * We seem to have multiple copies of sections. First copy is which
> +	 * is embedded in kernel in read only section. Some of these sections
> +	 * will be copied to a temporary buffer and relocated. And these
> +	 * sections will finally be copied to their final destination at
> +	 * segment load time.
> +	 *
> +	 * Use ->sh_offset to reflect section address in memory. It will
> +	 * point to original read only copy if section is not allocatable.
> +	 * Otherwise it will point to temporary copy which will be relocated.
> +	 *
> +	 * Use ->sh_addr to contain final address of the section where it
> +	 * will go during execution time.
> +	 */
> +	for (sechdr = sechdrs; sechdr < sechdrs + pi->ehdr->e_shnum; sechdr++) {
> +		if (sechdr->sh_type == SHT_NOBITS)
> +			continue;
> +
> +		sechdr->sh_offset = (unsigned long) pi->ehdr + sechdr->sh_offset;
> +	}
> +
> +	/* Determine how much memory is needed to load the executable. */
> +	for (phdr = phdrs; phdr < phdrs + pi->ehdr->e_phnum; phdr++) {
> +		if (phdr->p_type != PT_LOAD)
> +			continue;
> +
> +		if (!orig_load_addr) {
> +			orig_load_addr = phdr->p_vaddr - phdr->p_offset;
> +			kbuf.bufsz = first_offset = phdr->p_offset;
> +		}
> +
> +		if (kbuf.buf_align < phdr->p_align) {
> +			pr_debug("buf_align was %lx, now is %llx\n",
> +				 kbuf.buf_align, phdr->p_align);
> +			kbuf.buf_align = phdr->p_align;
> +		}
> +
> +		kbuf.bufsz += phdr->p_memsz;
> +	}
> +
> +	/* Allocate buffer for purgatory. */
> +	kbuf.buffer = vzalloc(kbuf.bufsz);
> +	if (!kbuf.buffer) {
> +		ret = -ENOMEM;
> +		goto out;
> +	}
> +
> +	/* Add buffer to segment list. */
> +	kbuf.memsz = kbuf.bufsz;
> +	ret = kexec_add_buffer(&kbuf);
> +	if (ret)
> +		goto out;
> +
> +	pi->purgatory_load_addr = kbuf.mem;
> +
> +	/* Load executable. */
> +	for (phdr = phdrs; phdr < phdrs + pi->ehdr->e_phnum; phdr++) {
> +		if (phdr->p_type != PT_LOAD)
> +			continue;
> +
> +		src = (const void *) pi->ehdr + phdr->p_offset;
> +		memcpy(kbuf.buffer + phdr->p_offset, src, phdr->p_filesz);
> +
> +		pr_debug("loaded segment of size %llx at %llx (base = %lx, offset = %llx)\n",
> +			 phdr->p_memsz, pi->purgatory_load_addr + phdr->p_offset, pi->purgatory_load_addr, phdr->p_offset);
> +
> +		/*
> +		 * Find sections within this segment and update their
> +		 * ->sh_offset to point to within the buffer.
> +		 */
> +		for (i = 0; i < pi->ehdr->e_shnum; i++) {
> +			if (sechdrs[i].sh_addr >= phdr->p_vaddr &&
> +			    sechdrs[i].sh_addr + sechdrs[i].sh_size <= phdr->p_vaddr + phdr->p_memsz) {
> +				sechdrs[i].sh_addr = sechdrs[i].sh_addr - orig_load_addr + pi->purgatory_load_addr;
> +				sechdrs[i].sh_offset = (unsigned long long) kbuf.buffer + sechdrs_c[i].sh_offset;
> +			}
> +		}
> +	}
> +
> +	/* Make kernel jump to purgatory after shutdown */
> +	image->start = pi->ehdr->e_entry - orig_load_addr + pi->purgatory_load_addr;
> +
> +	/* Used later to get/set symbol values */
> +	pi->sechdrs = sechdrs;
> +
> +	/*
> +	 * Used later to identify which section is purgatory and skip it
> +	 * from checksumming.
> +	 */
> +	pi->purgatory_buf = kbuf.buffer;
> +
> +	pr_debug("purgatory entry point at %lx\n", image->start);
> +
> +	return 0;
> +out:
> +	vfree(sechdrs);
> +	vfree(kbuf.buffer);
> +
> +	return ret;
> +}
> +#else /* CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY */
> +/*
> + * Load relocatable object purgatory using the section header information.
> + * A lot of code taken from kexec-tools.
> + */
>  static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
>  				  unsigned long max, int top_down)
>  {
> @@ -813,6 +955,7 @@ static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
>  	vfree(kbuf.buffer);
>  	return ret;
>  }
> +#endif /* CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY */
>  
>  static int kexec_apply_relocations(struct kimage *image)
>  {
> @@ -886,7 +1029,7 @@ int kexec_load_purgatory(struct kimage *image, unsigned long min,
>  	pi->ehdr = (Elf_Ehdr *)kexec_purgatory;
>  
>  	if (memcmp(pi->ehdr->e_ident, ELFMAG, SELFMAG) != 0
> -	    || pi->ehdr->e_type != ET_REL
> +	    || pi->ehdr->e_type != PURGATORY_ELF_TYPE
>  	    || !elf_check_arch(pi->ehdr)
>  	    || pi->ehdr->e_shentsize != sizeof(Elf_Shdr))
>  		return -ENOEXEC;
> @@ -942,7 +1085,13 @@ static Elf_Sym *kexec_purgatory_find_symbol(struct purgatory_info *pi,
>  
>  		/* Go through symbols for a match */
>  		for (k = 0; k < sechdrs[i].sh_size/sizeof(Elf_Sym); k++) {
> -			if (ELF_ST_BIND(syms[k].st_info) != STB_GLOBAL)
> +			/*
> +			 * FIXME: See if we can or should export the .TOC.
> +			 * symbol as global instead of searching local symbols
> +			 * here.
> +			 */
> +			if (ELF_ST_BIND(syms[k].st_info) != STB_GLOBAL &&
> +			    ELF_ST_BIND(syms[k].st_info) != STB_LOCAL)
>  				continue;
>  
>  			if (strcmp(strtab + syms[k].st_name, name) != 0)
> @@ -979,7 +1128,7 @@ void *kexec_purgatory_get_symbol_addr(struct kimage *image, const char *name)
>  	 * Returns the address where symbol will finally be loaded after
>  	 * kexec_load_segment()
>  	 */
> -	return (void *)(sechdr->sh_addr + sym->st_value);
> +	return (void *)(sechdr->sh_addr + sym_value_offset(pi, sym));
>  }
>  
>  /*
> @@ -1013,7 +1162,7 @@ int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
>  	}
>  
>  	sym_buf = (unsigned char *)sechdrs[sym->st_shndx].sh_offset +
> -					sym->st_value;
> +					sym_value_offset(pi, sym);
>  
>  	if (get_value)
>  		memcpy((void *)buf, sym_buf, size);
> diff --git a/kernel/kexec_internal.h b/kernel/kexec_internal.h
> index 4cef7e4706b0..c253b00f88d0 100644
> --- a/kernel/kexec_internal.h
> +++ b/kernel/kexec_internal.h
> @@ -20,6 +20,32 @@ struct kexec_sha_region {
>  	unsigned long len;
>  };
>  
> +#ifdef CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY
> +#define PURGATORY_ELF_TYPE ET_EXEC
> +
> +/*
> + * In position-independent executables, the symbol value is an absolute address,
> + * so convert it to a section-relative offset.
> + */
> +static inline Elf_Addr sym_value_offset(struct purgatory_info *pi, Elf_Sym *sym)
> +{
> +	const Elf_Shdr *sechdrs_c = (const void *) pi->ehdr + pi->ehdr->e_shoff;
> +
> +	return sym->st_value - sechdrs_c[sym->st_shndx].sh_addr;
> +}
> +#else
> +#define PURGATORY_ELF_TYPE ET_REL
> +
> +/*
> + * In a relocatable object, the symbol value already is a section-relative
> + * offset.
> + */
> +static inline Elf_Addr sym_value_offset(struct purgatory_info *pi, Elf_Sym *sym)
> +{
> +	return sym->st_value;
> +}
> +#endif
> +
>  void kimage_file_post_load_cleanup(struct kimage *image);
>  #else /* CONFIG_KEXEC_FILE */
>  static inline void kimage_file_post_load_cleanup(struct kimage *image) { }
> -- 
> 2.7.4
> 
> 
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
Eric W. Biederman Nov. 4, 2016, 3:13 p.m. UTC | #2
Baoquan He <bhe@redhat.com> writes:

> On 11/02/16 at 04:00am, Thiago Jung Bauermann wrote:
>> Hello,
>> 
>> The kexec_file code currently builds the purgatory as a partially linked object 
>> (using ld -r). Is there a particular reason to use that instead of a position 
>> independent executable (PIE)?
>
> It's taken as "-r", relocatable in user space kexec-tools too originally.
> I think Vivek just keeps it the same when moving into kernel.

At least on x86 using just -r removed the need for a GOT and all of the
other nasty dynamic relocatable bits, that are not needed when the you
don't want to share your text bits with the page cache.

I can see reaons for refactoring code but I expect PIE expecutables need
a GOT and all of that pain in the neck stuff that can just be avoided by
building the code to run at an absolute address.

So far I have not seen ELF relocations that are difficult to process.

Eric
Thiago Jung Bauermann Nov. 4, 2016, 3:43 p.m. UTC | #3
Hello Baoquan,

Am Freitag, 4. November 2016, 15:38:40 BRST schrieb Baoquan He:
> On 11/02/16 at 04:00am, Thiago Jung Bauermann wrote:
> > Hello,
> > 
> > The kexec_file code currently builds the purgatory as a partially linked
> > object (using ld -r). Is there a particular reason to use that instead of
> > a position independent executable (PIE)?
> 
> It's taken as "-r", relocatable in user space kexec-tools too originally.
> I think Vivek just keeps it the same when moving into kernel.

Ok. If that's the only reason then PIE is better suited at least for powerpc.

> > I found a discussion from 2013 in the archives but from what I understood
> > it was about the purgatory as a separate object vs having it linked into
> > the kernel, which is different from what I'm asking:
> > 
> > http://lists.infradead.org/pipermail/kexec/2013-December/010535.html
> > 
> > Here is my motivation for this question:
> >  On ppc64 purgatory.ro has 12 relocation types when built as a partially
> > 
> > linked object. This makes arch_kexec_apply_relocations_add duplicate a lot
> > of code with module_64.c:apply_relocate_add to implement these
> > relocations. The alternative is to do some refactoring so that both
> > functions can share the implementation of the relocations. This is done
> > in patches 5 and 6 of the
> > kexec_file_load implementation for powerpc:
> In user space kexec-tools utility, you also got this problem?

Yes, kexec-tools' purgatory.ro has 10 relocation types instead of 12 (I don't 
know why), but that's still a lot.

> > @@ -942,7 +1085,13 @@ static Elf_Sym *kexec_purgatory_find_symbol(struct
> > purgatory_info *pi,> 
> >  		/* Go through symbols for a match */
> >  		for (k = 0; k < sechdrs[i].sh_size/sizeof(Elf_Sym); k++) {
> > 
> > -			if (ELF_ST_BIND(syms[k].st_info) != STB_GLOBAL)
> > +			/*
> > +			 * FIXME: See if we can or should export the .TOC.
> > +			 * symbol as global instead of searching local symbols
> > +			 * here.
> > +			 */
> > +			if (ELF_ST_BIND(syms[k].st_info) != STB_GLOBAL &&
> > +			    ELF_ST_BIND(syms[k].st_info) != STB_LOCAL)
> > 
> >  				continue;
> >  			
> >  			if (strcmp(strtab + syms[k].st_name, name) != 0)
> > 

I don't need the change above anymore. I found a way to obtain the TOC pointer 
without looking for the .TOC. symbol.
Thiago Jung Bauermann Nov. 4, 2016, 7:19 p.m. UTC | #4
Hello Eric,

Am Freitag, 4. November 2016, 10:13:39 BRST schrieb Eric W. Biederman:
> Baoquan He <bhe@redhat.com> writes:
> > On 11/02/16 at 04:00am, Thiago Jung Bauermann wrote:
> >> Hello,
> >> 
> >> The kexec_file code currently builds the purgatory as a partially linked
> >> object (using ld -r). Is there a particular reason to use that instead
> >> of a position independent executable (PIE)?
> > 
> > It's taken as "-r", relocatable in user space kexec-tools too originally.
> > I think Vivek just keeps it the same when moving into kernel.
> 
> At least on x86 using just -r removed the need for a GOT and all of the
> other nasty dynamic relocatable bits, that are not needed when the you
> don't want to share your text bits with the page cache.
> 
> I can see reaons for refactoring code but I expect PIE expecutables need
> a GOT and all of that pain in the neck stuff that can just be avoided by
> building the code to run at an absolute address.

At least on powerpc, building the purgatory as PIE resulted in only the 
following differences:

1. A lot less relocation types to deal with.
2. __kexec_load_purgatory needs to use the program headers rather than the 
   section headers to figure out how to load the binary.
3. Symbol values are absolute addresses instead of relative to the start of 
   the section.

2. is an advantage too because it's actually easier to use the program headers 
because unlike section headers, the purpose of program headers is to provide 
the information needed by a program loader. You can see this by comparing the 
two implementations of __kexec_load_purgatory in the WIP patch I posted. The 
one using program headers is simpler.

3. isn't a problem, it's easy to convert the absolute addresses back into 
relative ones, as can be seen in my patch.

> So far I have not seen ELF relocations that are difficult to process.

The problem is not that it's difficult to process, but that on powerpc it 
takes a lot of code to implement that processing. In v9 of the kexec_file_load 
implementation for powerpc, the switch statement implementing all the 
relocation types (shared by powerpc's module_64.c and machine_kexec_file_64.c) 
has 200 lines. The switch statement implementing only the relocation types 
used by the PIE purgatory has 26 lines.

This is not a problem in x86, though: the purgatory built as a relocatable 
object has only two relocation types.
diff mbox

Patch

diff --git a/arch/Kconfig b/arch/Kconfig
index 659bdd079277..7fd6879be222 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -5,6 +5,9 @@ 
 config KEXEC_CORE
 	bool
 
+config HAVE_KEXEC_FILE_PIE_PURGATORY
+	bool
+
 config OPROFILE
 	tristate "OProfile system profiling"
 	depends on PROFILING
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 0c2df7f73792..dfc3e015160d 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -633,7 +633,149 @@  static int kexec_calculate_store_digests(struct kimage *image)
 	return ret;
 }
 
-/* Actually load purgatory. Lot of code taken from kexec-tools */
+#ifdef CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY
+/* Load PIE purgatory using the program header information. */
+static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
+				  unsigned long max, int top_down)
+{
+	struct purgatory_info *pi = &image->purgatory_info;
+	unsigned long first_offset;
+	unsigned long orig_load_addr = 0;
+	const void *src;
+	int i, ret;
+	const Elf_Phdr *phdrs = (const void *) pi->ehdr + pi->ehdr->e_phoff;
+	const Elf_Phdr *phdr;
+	const Elf_Shdr *sechdrs_c;
+	Elf_Shdr *sechdr;
+	Elf_Shdr *sechdrs = NULL;
+	struct kexec_buf kbuf = { .image = image, .bufsz = 0, .buf_align = 1,
+				  .buf_min = min, .buf_max = max,
+				  .top_down = top_down };
+
+	/*
+	 * sechdrs_c points to section headers in purgatory and are read
+	 * only. No modifications allowed.
+	 */
+	sechdrs_c = (void *) pi->ehdr + pi->ehdr->e_shoff;
+
+	/*
+	 * We can not modify sechdrs_c[] and its fields. It is read only.
+	 * Copy it over to a local copy where one can store some temporary
+	 * data and free it at the end. We need to modify ->sh_addr and
+	 * ->sh_offset fields to keep track of permanent and temporary
+	 * locations of sections.
+	 */
+	sechdrs = vzalloc(pi->ehdr->e_shnum * sizeof(Elf_Shdr));
+	if (!sechdrs)
+		return -ENOMEM;
+
+	memcpy(sechdrs, sechdrs_c, pi->ehdr->e_shnum * sizeof(Elf_Shdr));
+
+	/*
+	 * We seem to have multiple copies of sections. First copy is which
+	 * is embedded in kernel in read only section. Some of these sections
+	 * will be copied to a temporary buffer and relocated. And these
+	 * sections will finally be copied to their final destination at
+	 * segment load time.
+	 *
+	 * Use ->sh_offset to reflect section address in memory. It will
+	 * point to original read only copy if section is not allocatable.
+	 * Otherwise it will point to temporary copy which will be relocated.
+	 *
+	 * Use ->sh_addr to contain final address of the section where it
+	 * will go during execution time.
+	 */
+	for (sechdr = sechdrs; sechdr < sechdrs + pi->ehdr->e_shnum; sechdr++) {
+		if (sechdr->sh_type == SHT_NOBITS)
+			continue;
+
+		sechdr->sh_offset = (unsigned long) pi->ehdr + sechdr->sh_offset;
+	}
+
+	/* Determine how much memory is needed to load the executable. */
+	for (phdr = phdrs; phdr < phdrs + pi->ehdr->e_phnum; phdr++) {
+		if (phdr->p_type != PT_LOAD)
+			continue;
+
+		if (!orig_load_addr) {
+			orig_load_addr = phdr->p_vaddr - phdr->p_offset;
+			kbuf.bufsz = first_offset = phdr->p_offset;
+		}
+
+		if (kbuf.buf_align < phdr->p_align) {
+			pr_debug("buf_align was %lx, now is %llx\n",
+				 kbuf.buf_align, phdr->p_align);
+			kbuf.buf_align = phdr->p_align;
+		}
+
+		kbuf.bufsz += phdr->p_memsz;
+	}
+
+	/* Allocate buffer for purgatory. */
+	kbuf.buffer = vzalloc(kbuf.bufsz);
+	if (!kbuf.buffer) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+	/* Add buffer to segment list. */
+	kbuf.memsz = kbuf.bufsz;
+	ret = kexec_add_buffer(&kbuf);
+	if (ret)
+		goto out;
+
+	pi->purgatory_load_addr = kbuf.mem;
+
+	/* Load executable. */
+	for (phdr = phdrs; phdr < phdrs + pi->ehdr->e_phnum; phdr++) {
+		if (phdr->p_type != PT_LOAD)
+			continue;
+
+		src = (const void *) pi->ehdr + phdr->p_offset;
+		memcpy(kbuf.buffer + phdr->p_offset, src, phdr->p_filesz);
+
+		pr_debug("loaded segment of size %llx at %llx (base = %lx, offset = %llx)\n",
+			 phdr->p_memsz, pi->purgatory_load_addr + phdr->p_offset, pi->purgatory_load_addr, phdr->p_offset);
+
+		/*
+		 * Find sections within this segment and update their
+		 * ->sh_offset to point to within the buffer.
+		 */
+		for (i = 0; i < pi->ehdr->e_shnum; i++) {
+			if (sechdrs[i].sh_addr >= phdr->p_vaddr &&
+			    sechdrs[i].sh_addr + sechdrs[i].sh_size <= phdr->p_vaddr + phdr->p_memsz) {
+				sechdrs[i].sh_addr = sechdrs[i].sh_addr - orig_load_addr + pi->purgatory_load_addr;
+				sechdrs[i].sh_offset = (unsigned long long) kbuf.buffer + sechdrs_c[i].sh_offset;
+			}
+		}
+	}
+
+	/* Make kernel jump to purgatory after shutdown */
+	image->start = pi->ehdr->e_entry - orig_load_addr + pi->purgatory_load_addr;
+
+	/* Used later to get/set symbol values */
+	pi->sechdrs = sechdrs;
+
+	/*
+	 * Used later to identify which section is purgatory and skip it
+	 * from checksumming.
+	 */
+	pi->purgatory_buf = kbuf.buffer;
+
+	pr_debug("purgatory entry point at %lx\n", image->start);
+
+	return 0;
+out:
+	vfree(sechdrs);
+	vfree(kbuf.buffer);
+
+	return ret;
+}
+#else /* CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY */
+/*
+ * Load relocatable object purgatory using the section header information.
+ * A lot of code taken from kexec-tools.
+ */
 static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
 				  unsigned long max, int top_down)
 {
@@ -813,6 +955,7 @@  static int __kexec_load_purgatory(struct kimage *image, unsigned long min,
 	vfree(kbuf.buffer);
 	return ret;
 }
+#endif /* CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY */
 
 static int kexec_apply_relocations(struct kimage *image)
 {
@@ -886,7 +1029,7 @@  int kexec_load_purgatory(struct kimage *image, unsigned long min,
 	pi->ehdr = (Elf_Ehdr *)kexec_purgatory;
 
 	if (memcmp(pi->ehdr->e_ident, ELFMAG, SELFMAG) != 0
-	    || pi->ehdr->e_type != ET_REL
+	    || pi->ehdr->e_type != PURGATORY_ELF_TYPE
 	    || !elf_check_arch(pi->ehdr)
 	    || pi->ehdr->e_shentsize != sizeof(Elf_Shdr))
 		return -ENOEXEC;
@@ -942,7 +1085,13 @@  static Elf_Sym *kexec_purgatory_find_symbol(struct purgatory_info *pi,
 
 		/* Go through symbols for a match */
 		for (k = 0; k < sechdrs[i].sh_size/sizeof(Elf_Sym); k++) {
-			if (ELF_ST_BIND(syms[k].st_info) != STB_GLOBAL)
+			/*
+			 * FIXME: See if we can or should export the .TOC.
+			 * symbol as global instead of searching local symbols
+			 * here.
+			 */
+			if (ELF_ST_BIND(syms[k].st_info) != STB_GLOBAL &&
+			    ELF_ST_BIND(syms[k].st_info) != STB_LOCAL)
 				continue;
 
 			if (strcmp(strtab + syms[k].st_name, name) != 0)
@@ -979,7 +1128,7 @@  void *kexec_purgatory_get_symbol_addr(struct kimage *image, const char *name)
 	 * Returns the address where symbol will finally be loaded after
 	 * kexec_load_segment()
 	 */
-	return (void *)(sechdr->sh_addr + sym->st_value);
+	return (void *)(sechdr->sh_addr + sym_value_offset(pi, sym));
 }
 
 /*
@@ -1013,7 +1162,7 @@  int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
 	}
 
 	sym_buf = (unsigned char *)sechdrs[sym->st_shndx].sh_offset +
-					sym->st_value;
+					sym_value_offset(pi, sym);
 
 	if (get_value)
 		memcpy((void *)buf, sym_buf, size);
diff --git a/kernel/kexec_internal.h b/kernel/kexec_internal.h
index 4cef7e4706b0..c253b00f88d0 100644
--- a/kernel/kexec_internal.h
+++ b/kernel/kexec_internal.h
@@ -20,6 +20,32 @@  struct kexec_sha_region {
 	unsigned long len;
 };
 
+#ifdef CONFIG_HAVE_KEXEC_FILE_PIE_PURGATORY
+#define PURGATORY_ELF_TYPE ET_EXEC
+
+/*
+ * In position-independent executables, the symbol value is an absolute address,
+ * so convert it to a section-relative offset.
+ */
+static inline Elf_Addr sym_value_offset(struct purgatory_info *pi, Elf_Sym *sym)
+{
+	const Elf_Shdr *sechdrs_c = (const void *) pi->ehdr + pi->ehdr->e_shoff;
+
+	return sym->st_value - sechdrs_c[sym->st_shndx].sh_addr;
+}
+#else
+#define PURGATORY_ELF_TYPE ET_REL
+
+/*
+ * In a relocatable object, the symbol value already is a section-relative
+ * offset.
+ */
+static inline Elf_Addr sym_value_offset(struct purgatory_info *pi, Elf_Sym *sym)
+{
+	return sym->st_value;
+}
+#endif
+
 void kimage_file_post_load_cleanup(struct kimage *image);
 #else /* CONFIG_KEXEC_FILE */
 static inline void kimage_file_post_load_cleanup(struct kimage *image) { }