Message ID | 1501003621-9478-1-git-send-email-marcelo.cerri@canonical.com
--- | ---
State | New
Headers | show
On 25.07.2017 19:27, Marcelo Henrique Cerri wrote:
> BugLink: http://bugs.launchpad.net/bugs/1701744
>
> Azure host OS has major and minor version numbers. Rename the following
> modules:
>
> drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus-rdma-142.ko
> drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus-rdma-144.ko
>
> To:
>
> drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus-rdma-142.0.ko
> drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus-rdma-144.0.ko
>
> Signed-off-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>

Acked-by: Stefan Bader <stefan.bader@canonical.com>

> ---
>  drivers/infiniband/hw/vmbus-rdma/Makefile | 4 +-
>  .../hw/vmbus-rdma/vmbus-rdma-142.0/COPYING | 339 +++
>  .../hw/vmbus-rdma/vmbus-rdma-142.0/Makefile | 3 +
>  .../hw/vmbus-rdma/vmbus-rdma-142.0/hvnd_addr.c | 136 +
>  .../hw/vmbus-rdma/vmbus-rdma-142.0/mx_abi.h | 232 ++
>  .../hw/vmbus-rdma/vmbus-rdma-142.0/provider.c | 2961 +++++++++++++++++++
>  .../hw/vmbus-rdma/vmbus-rdma-142.0/user.h | 107 +
>  .../hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.c | 3074 ++++++++++++++++++++
>  .../hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.h | 2202 ++++++++++++++
>  .../hw/vmbus-rdma/vmbus-rdma-142/COPYING | 339 ---
>  .../hw/vmbus-rdma/vmbus-rdma-142/Makefile | 3 -
>  .../hw/vmbus-rdma/vmbus-rdma-142/hvnd_addr.c | 136 -
>  .../hw/vmbus-rdma/vmbus-rdma-142/mx_abi.h | 232 --
>  .../hw/vmbus-rdma/vmbus-rdma-142/provider.c | 2961 -------------------
>  .../infiniband/hw/vmbus-rdma/vmbus-rdma-142/user.h | 107 -
>  .../hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.c | 3074 --------------------
>  .../hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.h | 2202 --------------
>  .../hw/vmbus-rdma/vmbus-rdma-144.0/COPYING | 339 +++
>  .../hw/vmbus-rdma/vmbus-rdma-144.0/Makefile | 3 +
>  .../hw/vmbus-rdma/vmbus-rdma-144.0/hvnd_addr.c | 136 +
>  .../hw/vmbus-rdma/vmbus-rdma-144.0/mx_abi.h | 232 ++
>  .../hw/vmbus-rdma/vmbus-rdma-144.0/provider.c | 2961 +++++++++++++++++++
>  .../hw/vmbus-rdma/vmbus-rdma-144.0/user.h | 107 +
>  .../hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.c | 3074 ++++++++++++++++++++
>  .../hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.h | 2202 ++++++++++++++
>  .../hw/vmbus-rdma/vmbus-rdma-144/COPYING | 339 ---
>  .../hw/vmbus-rdma/vmbus-rdma-144/Makefile | 3 -
>  .../hw/vmbus-rdma/vmbus-rdma-144/hvnd_addr.c | 136 -
>  .../hw/vmbus-rdma/vmbus-rdma-144/mx_abi.h | 232 --
>  .../hw/vmbus-rdma/vmbus-rdma-144/provider.c | 2961 -------------------
>  .../infiniband/hw/vmbus-rdma/vmbus-rdma-144/user.h | 107 -
>  .../hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.c | 3074 --------------------
>  .../hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.h | 2202 --------------
>  33 files changed, 18110 insertions(+), 18110 deletions(-)
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/COPYING
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/Makefile
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/hvnd_addr.c
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/mx_abi.h
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/provider.c
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/user.h
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.c
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.h
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/COPYING
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/Makefile
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/hvnd_addr.c
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/mx_abi.h
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/provider.c
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/user.h
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.c
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.h
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/COPYING
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/Makefile
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/hvnd_addr.c
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/mx_abi.h
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/provider.c
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/user.h
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.c
>  create mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.h
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/COPYING
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/Makefile
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/hvnd_addr.c
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/mx_abi.h
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/provider.c
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/user.h
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.c
>  delete mode 100644 drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.h
>
> diff --git a/drivers/infiniband/hw/vmbus-rdma/Makefile b/drivers/infiniband/hw/vmbus-rdma/Makefile
> index b6e900002e2f..9a0cb3cff61b 100644
> --- a/drivers/infiniband/hw/vmbus-rdma/Makefile
> +++ b/drivers/infiniband/hw/vmbus-rdma/Makefile
> @@ -1,2 +1,2 @@
> -obj-$(CONFIG_HYPERV_INFINIBAND_ND) += vmbus-rdma-142/
> -obj-$(CONFIG_HYPERV_INFINIBAND_ND) += vmbus-rdma-144/
> +obj-$(CONFIG_HYPERV_INFINIBAND_ND) += vmbus-rdma-142.0/
> +obj-$(CONFIG_HYPERV_INFINIBAND_ND) += vmbus-rdma-144.0/
> diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/COPYING b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/COPYING
> new file mode 100644
> index 000000000000..f0960c946b07
> --- /dev/null
> +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/COPYING
> @@ -0,0 +1,339 @@
> +
> +"This software program is licensed subject to the GNU General Public License
> +(GPL). Version 2, June 1991, available at
> +<http://www.fsf.org/copyleft/gpl.html>"
> +
> +GNU General Public License
> +
> +Version 2, June 1991
> +
> +Copyright (C) 1989, 1991 Free Software Foundation, Inc.
> +51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
> +
> +Everyone is permitted to copy and distribute verbatim copies of this license
> +document, but changing it is not allowed.
> +
> +Preamble
> +
> +The licenses for most software are designed to take away your freedom to
> +share and change it. By contrast, the GNU General Public License is intended
> +to guarantee your freedom to share and change free software--to make sure
> +the software is free for all its users. This General Public License applies
> +to most of the Free Software Foundation's software and to any other program
> +whose authors commit to using it. (Some other Free Software Foundation
> +software is covered by the GNU Library General Public License instead.) You
> +can apply it to your programs, too.
> +
> +When we speak of free software, we are referring to freedom, not price. Our
> +General Public Licenses are designed to make sure that you have the freedom
> +to distribute copies of free software (and charge for this service if you
> +wish), that you receive source code or can get it if you want it, that you
> +can change the software or use pieces of it in new free programs; and that
> +you know you can do these things.
> +
> +To protect your rights, we need to make restrictions that forbid anyone to
> +deny you these rights or to ask you to surrender the rights. These
> +restrictions translate to certain responsibilities for you if you distribute
> +copies of the software, or if you modify it.
> +
> +For example, if you distribute copies of such a program, whether gratis or
> +for a fee, you must give the recipients all the rights that you have. You
> +must make sure that they, too, receive or can get the source code. And you
> +must show them these terms so they know their rights.
> +
> +We protect your rights with two steps: (1) copyright the software, and (2)
> +offer you this license which gives you legal permission to copy, distribute
> +and/or modify the software.
> +
> +Also, for each author's protection and ours, we want to make certain that
> +everyone understands that there is no warranty for this free software. If
> +the software is modified by someone else and passed on, we want its
> +recipients to know that what they have is not the original, so that any
> +problems introduced by others will not reflect on the original authors'
> +reputations.
> +
> +Finally, any free program is threatened constantly by software patents. We
> +wish to avoid the danger that redistributors of a free program will
> +individually obtain patent licenses, in effect making the program
> +proprietary. To prevent this, we have made it clear that any patent must be
> +licensed for everyone's free use or not licensed at all.
> +
> +The precise terms and conditions for copying, distribution and modification
> +follow.
> +
> +TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
> +
> +0. This License applies to any program or other work which contains a notice
> + placed by the copyright holder saying it may be distributed under the
> + terms of this General Public License. The "Program", below, refers to any
> + such program or work, and a "work based on the Program" means either the
> + Program or any derivative work under copyright law: that is to say, a
> + work containing the Program or a portion of it, either verbatim or with
> + modifications and/or translated into another language. (Hereinafter,
> + translation is included without limitation in the term "modification".)
> + Each licensee is addressed as "you".
> +
> + Activities other than copying, distribution and modification are not
> + covered by this License; they are outside its scope. The act of running
> + the Program is not restricted, and the output from the Program is covered
> + only if its contents constitute a work based on the Program (independent
> + of having been made by running the Program). Whether that is true depends
> + on what the Program does.
> +
> +1. You may copy and distribute verbatim copies of the Program's source code
> + as you receive it, in any medium, provided that you conspicuously and
> + appropriately publish on each copy an appropriate copyright notice and
> + disclaimer of warranty; keep intact all the notices that refer to this
> + License and to the absence of any warranty; and give any other recipients
> + of the Program a copy of this License along with the Program.
> +
> + You may charge a fee for the physical act of transferring a copy, and you
> + may at your option offer warranty protection in exchange for a fee.
> +
> +2. You may modify your copy or copies of the Program or any portion of it,
> + thus forming a work based on the Program, and copy and distribute such
> + modifications or work under the terms of Section 1 above, provided that
> + you also meet all of these conditions:
> +
> + * a) You must cause the modified files to carry prominent notices stating
> + that you changed the files and the date of any change.
> +
> + * b) You must cause any work that you distribute or publish, that in
> + whole or in part contains or is derived from the Program or any part
> + thereof, to be licensed as a whole at no charge to all third parties
> + under the terms of this License.
> +
> + * c) If the modified program normally reads commands interactively when
> + run, you must cause it, when started running for such interactive
> + use in the most ordinary way, to print or display an announcement
> + including an appropriate copyright notice and a notice that there is
> + no warranty (or else, saying that you provide a warranty) and that
> + users may redistribute the program under these conditions, and
> + telling the user how to view a copy of this License. (Exception: if
> + the Program itself is interactive but does not normally print such
> + an announcement, your work based on the Program is not required to
> + print an announcement.)
> +
> + These requirements apply to the modified work as a whole. If identifiable
> + sections of that work are not derived from the Program, and can be
> + reasonably considered independent and separate works in themselves, then
> + this License, and its terms, do not apply to those sections when you
> + distribute them as separate works. But when you distribute the same
> + sections as part of a whole which is a work based on the Program, the
> + distribution of the whole must be on the terms of this License, whose
> + permissions for other licensees extend to the entire whole, and thus to
> + each and every part regardless of who wrote it.
> +
> + Thus, it is not the intent of this section to claim rights or contest
> + your rights to work written entirely by you; rather, the intent is to
> + exercise the right to control the distribution of derivative or
> + collective works based on the Program.
> +
> + In addition, mere aggregation of another work not based on the Program
> + with the Program (or with a work based on the Program) on a volume of a
> + storage or distribution medium does not bring the other work under the
> + scope of this License.
> +
> +3. You may copy and distribute the Program (or a work based on it, under
> + Section 2) in object code or executable form under the terms of Sections
> + 1 and 2 above provided that you also do one of the following:
> +
> + * a) Accompany it with the complete corresponding machine-readable source
> + code, which must be distributed under the terms of Sections 1 and 2
> + above on a medium customarily used for software interchange; or,
> +
> + * b) Accompany it with a written offer, valid for at least three years,
> + to give any third party, for a charge no more than your cost of
> + physically performing source distribution, a complete machine-
> + readable copy of the corresponding source code, to be distributed
> + under the terms of Sections 1 and 2 above on a medium customarily
> + used for software interchange; or,
> +
> + * c) Accompany it with the information you received as to the offer to
> + distribute corresponding source code. (This alternative is allowed
> + only for noncommercial distribution and only if you received the
> + program in object code or executable form with such an offer, in
> + accord with Subsection b above.)
> +
> + The source code for a work means the preferred form of the work for
> + making modifications to it. For an executable work, complete source code
> + means all the source code for all modules it contains, plus any
> + associated interface definition files, plus the scripts used to control
> + compilation and installation of the executable. However, as a special
> + exception, the source code distributed need not include anything that is
> + normally distributed (in either source or binary form) with the major
> + components (compiler, kernel, and so on) of the operating system on which
> + the executable runs, unless that component itself accompanies the
> + executable.
> +
> + If distribution of executable or object code is made by offering access
> + to copy from a designated place, then offering equivalent access to copy
> + the source code from the same place counts as distribution of the source
> + code, even though third parties are not compelled to copy the source
> + along with the object code.
> +
> +4. You may not copy, modify, sublicense, or distribute the Program except as
> + expressly provided under this License. Any attempt otherwise to copy,
> + modify, sublicense or distribute the Program is void, and will
> + automatically terminate your rights under this License. However, parties
> + who have received copies, or rights, from you under this License will not
> + have their licenses terminated so long as such parties remain in full
> + compliance.
> +
> +5. You are not required to accept this License, since you have not signed
> + it. However, nothing else grants you permission to modify or distribute
> + the Program or its derivative works. These actions are prohibited by law
> + if you do not accept this License. Therefore, by modifying or
> + distributing the Program (or any work based on the Program), you
> + indicate your acceptance of this License to do so, and all its terms and
> + conditions for copying, distributing or modifying the Program or works
> + based on it.
> +
> +6. Each time you redistribute the Program (or any work based on the
> + Program), the recipient automatically receives a license from the
> + original licensor to copy, distribute or modify the Program subject to
> + these terms and conditions. You may not impose any further restrictions
> + on the recipients' exercise of the rights granted herein. You are not
> + responsible for enforcing compliance by third parties to this License.
> +
> +7. If, as a consequence of a court judgment or allegation of patent
> + infringement or for any other reason (not limited to patent issues),
> + conditions are imposed on you (whether by court order, agreement or
> + otherwise) that contradict the conditions of this License, they do not
> + excuse you from the conditions of this License. If you cannot distribute
> + so as to satisfy simultaneously your obligations under this License and
> + any other pertinent obligations, then as a consequence you may not
> + distribute the Program at all. For example, if a patent license would
> + not permit royalty-free redistribution of the Program by all those who
> + receive copies directly or indirectly through you, then the only way you
> + could satisfy both it and this License would be to refrain entirely from
> + distribution of the Program.
> +
> + If any portion of this section is held invalid or unenforceable under any
> + particular circumstance, the balance of the section is intended to apply
> + and the section as a whole is intended to apply in other circumstances.
> +
> + It is not the purpose of this section to induce you to infringe any
> + patents or other property right claims or to contest validity of any
> + such claims; this section has the sole purpose of protecting the
> + integrity of the free software distribution system, which is implemented
> + by public license practices. Many people have made generous contributions
> + to the wide range of software distributed through that system in
> + reliance on consistent application of that system; it is up to the
> + author/donor to decide if he or she is willing to distribute software
> + through any other system and a licensee cannot impose that choice.
> +
> + This section is intended to make thoroughly clear what is believed to be
> + a consequence of the rest of this License.
> +
> +8. If the distribution and/or use of the Program is restricted in certain
> + countries either by patents or by copyrighted interfaces, the original
> + copyright holder who places the Program under this License may add an
> + explicit geographical distribution limitation excluding those countries,
> + so that distribution is permitted only in or among countries not thus
> + excluded. In such case, this License incorporates the limitation as if
> + written in the body of this License.
> +
> +9. The Free Software Foundation may publish revised and/or new versions of
> + the General Public License from time to time. Such new versions will be
> + similar in spirit to the present version, but may differ in detail to
> + address new problems or concerns.
> +
> + Each version is given a distinguishing version number. If the Program
> + specifies a version number of this License which applies to it and "any
> + later version", you have the option of following the terms and
> + conditions either of that version or of any later version published by
> + the Free Software Foundation. If the Program does not specify a version
> + number of this License, you may choose any version ever published by the
> + Free Software Foundation.
> +
> +10. If you wish to incorporate parts of the Program into other free programs
> + whose distribution conditions are different, write to the author to ask
> + for permission. For software which is copyrighted by the Free Software
> + Foundation, write to the Free Software Foundation; we sometimes make
> + exceptions for this. Our decision will be guided by the two goals of
> + preserving the free status of all derivatives of our free software and
> + of promoting the sharing and reuse of software generally.
> +
> + NO WARRANTY
> +
> +11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
> + FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
> + OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
> + PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER
> + EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
> + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE
> + ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH
> + YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL
> + NECESSARY SERVICING, REPAIR OR CORRECTION.
> +
> +12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
> + WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
> + REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR
> + DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL
> + DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM
> + (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED
> + INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF
> + THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR
> + OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
> +
> +END OF TERMS AND CONDITIONS
> +
> +How to Apply These Terms to Your New Programs
> +
> +If you develop a new program, and you want it to be of the greatest
> +possible use to the public, the best way to achieve this is to make it free
> +software which everyone can redistribute and change under these terms.
> +
> +To do so, attach the following notices to the program. It is safest to
> +attach them to the start of each source file to most effectively convey the
> +exclusion of warranty; and each file should have at least the "copyright"
> +line and a pointer to where the full notice is found.
> +
> +one line to give the program's name and an idea of what it does.
> +Copyright (C) yyyy name of author
> +
> +This program is free software; you can redistribute it and/or modify it
> +under the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 2 of the License, or (at your option)
> +any later version.
> +
> +This program is distributed in the hope that it will be useful, but WITHOUT
> +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
> +more details.
> +
> +You should have received a copy of the GNU General Public License along with
> +this program; if not, write to the Free Software Foundation, Inc., 59
> +Temple Place - Suite 330, Boston, MA 02111-1307, USA.
> +
> +Also add information on how to contact you by electronic and paper mail.
> +
> +If the program is interactive, make it output a short notice like this when
> +it starts in an interactive mode:
> +
> +Gnomovision version 69, Copyright (C) year name of author Gnomovision comes
> +with ABSOLUTELY NO WARRANTY; for details type 'show w'. This is free
> +software, and you are welcome to redistribute it under certain conditions;
> +type 'show c' for details.
> +
> +The hypothetical commands 'show w' and 'show c' should show the appropriate
> +parts of the General Public License. Of course, the commands you use may be
> +called something other than 'show w' and 'show c'; they could even be
> +mouse-clicks or menu items--whatever suits your program.
> +
> +You should also get your employer (if you work as a programmer) or your
> +school, if any, to sign a "copyright disclaimer" for the program, if
> +necessary. Here is a sample; alter the names:
> +
> +Yoyodyne, Inc., hereby disclaims all copyright interest in the program
> +'Gnomovision' (which makes passes at compilers) written by James Hacker.
> +
> +signature of Ty Coon, 1 April 1989
> +Ty Coon, President of Vice
> +
> +This General Public License does not permit incorporating your program into
> +proprietary programs. If your program is a subroutine library, you may
> +consider it more useful to permit linking proprietary applications with the
> +library. If this is what you want to do, use the GNU Library General Public
> +License instead of this License.
> diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/Makefile b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/Makefile
> new file mode 100644
> index 000000000000..8fc23718bcf1
> --- /dev/null
> +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/Makefile
> @@ -0,0 +1,3 @@
> +obj-$(CONFIG_HYPERV_INFINIBAND_ND) += hv_network_direct_142.0.o
> +
> +hv_network_direct_142.0-y := provider.o vmbus_rdma.o hvnd_addr.o
> diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/hvnd_addr.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/hvnd_addr.c
> new file mode 100644
> index 000000000000..c8c2295b63af
> --- /dev/null
> +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/hvnd_addr.c
> @@ -0,0 +1,136 @@
> +/*
> + * Copyright (c) 2014, Microsoft Corporation.
> + *
> + * Author:
> + * K. Y. Srinivasan <kys@microsoft.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> + * NON INFRINGEMENT. See the GNU General Public License for more
> + * details.
> + *
> + * Bug fixes/enhancements: Long Li <longli@microsoft.com>
> + */
> +
> +#include <linux/completion.h>
> +#include <linux/module.h>
> +#include <linux/errno.h>
> +#include <linux/hyperv.h>
> +#include <linux/efi.h>
> +#include <linux/slab.h>
> +#include <linux/cred.h>
> +#include <linux/sched.h>
> +#include <linux/types.h>
> +#include <linux/completion.h>
> +#include <linux/scatterlist.h>
> +#include <rdma/ib_umem.h>
> +#include <rdma/ib_addr.h>
> +
> +#include "vmbus_rdma.h"
> +
> +
> +#include <linux/semaphore.h>
> +#include <linux/fs.h>
> +#include <linux/nls.h>
> +#include <linux/workqueue.h>
> +#include <linux/cdev.h>
> +#include <linux/hyperv.h>
> +#include <linux/sched.h>
> +#include <linux/uaccess.h>
> +#include <linux/miscdevice.h>
> +#include <linux/hyperv.h>
> +
> +
> +int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev,
> + struct hvnd_ucontext *uctx,
> + union nd_sockaddr_inet *og_addr)
> +{
> + int ret;
> + /*
> + * Query the host and select the first address.
> + */
> + struct pkt_query_addr_list pkt;
> +
> + // KYS try to avoid having to zero everything
> + memset(&pkt, 0, sizeof(pkt));
> +
> + hvnd_init_hdr(&pkt.hdr,
> + (sizeof(pkt) -
> + sizeof(struct ndv_packet_hdr_control_1)),
> + uctx->create_pkt.handle.local,
> + uctx->create_pkt.handle.remote,
> + IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST, 0, 0, 0);
> +
> + pkt.ioctl.in.version = ND_VERSION_1;
> + pkt.ioctl.in.reserved = 0;
> + pkt.ioctl.in.handle = uctx->adaptor_hdl;
> +
> + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr,
> + sizeof(pkt), (u64)&pkt);
> +
> + if (ret)
> + return ret;
> +
> + /*
> + * Copy the address out.
> + */
> +
> + memcpy(og_addr, &pkt.ioctl.out[0], sizeof(*og_addr));
> + return 0;
> +
> +}
> +
> +static struct rdma_addr_client self;
> +
> +struct resolve_cb_context {
> + struct rdma_dev_addr *addr;
> + struct completion comp;
> +};
> +
> +void hvnd_addr_init(void)
> +{
> + rdma_addr_register_client(&self);
> +}
> +
> +void hvnd_addr_deinit(void)
> +{
> + rdma_addr_unregister_client(&self);
> +}
> +
> +static void resolve_cb(int status, struct sockaddr *src_addr,
> + struct rdma_dev_addr *addr, void *context)
> +{
> + memcpy(((struct resolve_cb_context *)context)->addr, addr, sizeof(struct
> + rdma_dev_addr));
> + complete(&((struct resolve_cb_context *)context)->comp);
> +}
> +
> +int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote,
> + char *mac_addr)
> +{
> + struct rdma_dev_addr dev_addr;
> + struct resolve_cb_context ctx;
> + int ret;
> +
> + memset(&dev_addr, 0, sizeof(dev_addr));
> + dev_addr.net = &init_net;
> + ctx.addr = &dev_addr;
> + init_completion(&ctx.comp);
> +
> + ret = rdma_resolve_ip(&self, local, remote, &dev_addr, 1000,
> + resolve_cb, &ctx);
> +
> + if (ret) {
> + hvnd_error("rdma_resolve_ip failed ret=%d\n", ret);
> + return ret;
> + }
> +
> + wait_for_completion(&ctx.comp);
> + memcpy(mac_addr, dev_addr.dst_dev_addr, ETH_ALEN);
> + return ret;
> +}
> diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/mx_abi.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/mx_abi.h
> new file mode 100644
> index 000000000000..8f55cb408de0
> --- /dev/null
> +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/mx_abi.h
> @@ -0,0 +1,232 @@
> +/*
> + * Copyright (c) 2005 Topspin Communications. All rights reserved.
> + * Copyright (c) 2005 Cisco Systems. All rights reserved.
> + * Copyright (c) 2005 PathScale, Inc. All rights reserved.
> + *
> + * This software is available to you under a choice of one of two
> + * licenses. You may choose to be licensed under the terms of the GNU
> + * General Public License (GPL) Version 2, available from the file
> + * COPYING in the main directory of this source tree, or the
> + * OpenIB.org BSD license below:
> + *
> + * Redistribution and use in source and binary forms, with or
> + * without modification, are permitted provided that the following
> + * conditions are met:
> + *
> + * - Redistributions of source code must retain the above
> + * copyright notice, this list of conditions and the following
> + * disclaimer.
> + *
> + * - Redistributions in binary form must reproduce the above
> + * copyright notice, this list of conditions and the following
> + * disclaimer in the documentation and/or other materials
> + * provided with the distribution.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
> + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
> + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
> + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
> + * SOFTWARE.
> + *
> + * KYS: made some modifications.
> + */
> +
> +#ifndef MX_ABI_H
> +#define MX_ABI_H
> +
> +
> +
> +/*
> + * Make sure that all structs defined in this file remain laid out so
> + * that they pack the same way on 32-bit and 64-bit architectures (to
> + * avoid incompatibility between 32-bit userspace and 64-bit kernels).
> + * Specifically:
> + * - Do not use pointer types -- pass pointers in UINT64 instead.
> + * - Make sure that any structure larger than 4 bytes is padded to a
> + * multiple of 8 bytes. Otherwise the structure size will be
> + * different between 32-bit and 64-bit architectures.
> + */
> +
> +enum ibv_get_context_mappings {
> + IBV_GET_CONTEXT_UAR,
> + IBV_GET_CONTEXT_BF,
> + IBV_GET_CONTEXT_MAPPING_MAX
> +};
> +
> +struct ibv_get_context_req {
> +
> + union nd_mapping mappings[IBV_GET_CONTEXT_MAPPING_MAX];
> +};
> +
> +struct ibv_get_context_resp {
> +
> + // mmap UAR and BF
> + struct nd_mapping_result mapping_results[IBV_GET_CONTEXT_MAPPING_MAX];
> +
> + // mmap Blue Flame
> + int bf_buf_size;
> + int bf_offset;
> +
> + // mlx4_query_device result
> + int max_qp_wr;
> + int max_sge;
> + int max_cqe;
> +
> + // general parameters
> + u32 cqe_size;
> + u32 vend_id;
> + u16 dev_id;
> + u16 bf_reg_size;
> + u16 bf_regs_per_page;
> + u16 reserved1;
> +
> + // ibv_cmd_get_context result
> + u32 qp_tab_size;
> +
> + u32 reserved2;
> +};
> +
> +struct ibv_alloc_pd_resp {
> + u64 pd_handle;
> + u32 pdn;
> + u32 reserved;
> +};
> +
> +struct ibv_reg_mr {
> + u64 start;
> + u64 length;
> + u64 hca_va;
> + u32 access_flags;
> + u32 pdn;
> + u64 pd_handle;
> +};
> +
> +struct ibv_reg_mr_resp {
> + u64 mr_handle;
> + u32 lkey;
> + u32 rkey;
> +};
> +
> +
> +enum mlx4_ib_create_cq_mapping {
> + MLX4_IB_CREATE_CQ_BUF,
> + MLX4_IB_CREATE_CQ_DB,
> + MLX4_IB_CREATE_CQ_ARM_SN, // Windows specific
> + MLX4_IB_CREATE_CQ_MAPPING_MAX
> +};
> +
> +#define MLX4_CQ_FLAGS_ARM_IN_KERNEL 1
> +
> +struct ibv_create_cq {
> + union nd_mapping mappings[MLX4_IB_CREATE_CQ_MAPPING_MAX];
> + u32 flags;
> +};
> +
> +struct ibv_create_cq_resp {
> + struct nd_mapping_result mapping_results[MLX4_IB_CREATE_CQ_MAPPING_MAX];
> + u32 cqn;
> + u32 cqe;
> +};
> +
> +enum mlx4_ib_create_srq_mappings {
> + MLX4_IB_CREATE_SRQ_BUF,
> + MLX4_IB_CREATE_SRQ_DB,
> + MLX4_IB_CREATE_SRQ_MAPPINGS_MAX
> +};
> +
> +struct ibv_create_srq {
> + union nd_mapping mappings[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX];
> +};
> +
> +struct ibv_create_srq_resp {
> + struct nd_mapping_result mapping_results[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX];
> +};
> +
> +enum mlx4_ib_create_qp_mappings {
> + MLX4_IB_CREATE_QP_BUF,
> + MLX4_IB_CREATE_QP_DB,
> + MLX4_IB_CREATE_QP_MAPPINGS_MAX
> +};
> +
> +struct ibv_create_qp {
> + union nd_mapping mappings[MLX4_IB_CREATE_QP_MAPPINGS_MAX];
> + u8 log_sq_bb_count;
> + u8 log_sq_stride;
> + u8 sq_no_prefetch;
> + u8 reserved;
> +};
> +
> +struct ibv_create_qp_resp {
> + struct nd_mapping_result mapping_results[MLX4_IB_CREATE_QP_MAPPINGS_MAX];
> + // struct ib_uverbs_create_qp_resp
> + u64 qp_handle;
> + u32 qpn;
> + u32 max_send_wr;
> + u32 max_recv_wr;
> + u32 max_send_sge;
> + u32 max_recv_sge;
> + u32 max_inline_data;
> +};
> +
> +enum ibv_qp_attr_mask {
> + IBV_QP_STATE = 1 << 0,
> + IBV_QP_CUR_STATE = 1 << 1,
> + IBV_QP_EN_SQD_ASYNC_NOTIFY = 1 << 2,
> + IBV_QP_ACCESS_FLAGS = 1 << 3,
> + IBV_QP_PKEY_INDEX = 1 << 4,
> + IBV_QP_PORT = 1 << 5,
> + IBV_QP_QKEY = 1 << 6,
> + IBV_QP_AV = 1 << 7,
> + IBV_QP_PATH_MTU = 1 << 8,
> + IBV_QP_TIMEOUT = 1 << 9,
> + IBV_QP_RETRY_CNT = 1 << 10,
> + IBV_QP_RNR_RETRY = 1 << 11,
> + IBV_QP_RQ_PSN = 1 << 12,
> + IBV_QP_MAX_QP_RD_ATOMIC = 1 << 13,
> + IBV_QP_ALT_PATH = 1 << 14,
> + IBV_QP_MIN_RNR_TIMER = 1 << 15,
> + IBV_QP_SQ_PSN = 1 << 16,
> + IBV_QP_MAX_DEST_RD_ATOMIC = 1 << 17,
> + IBV_QP_PATH_MIG_STATE = 1 << 18,
> + IBV_QP_CAP = 1 << 19,
> + IBV_QP_DEST_QPN = 1 << 20
> +};
> +
> +enum ibv_qp_state {
> + IBV_QPS_RESET,
> + IBV_QPS_INIT,
> + IBV_QPS_RTR,
> + IBV_QPS_RTS,
> + IBV_QPS_SQD,
> + IBV_QPS_SQE,
> + IBV_QPS_ERR
> +};
> +
> +
> +struct ibv_modify_qp_resp {
> + enum ibv_qp_attr_mask attr_mask;
> + u8 qp_state;
> + u8 reserved[3];
> +};
> +
> +struct ibv_create_ah_resp {
> + u64 start;
> +};
> +
> +/*
> + * Some mlx4 specific kernel definitions. Perhaps could be in
> + * separate file.
> + */
> +
> +struct mlx4_ib_user_db_page {
> + struct list_head list;
> + struct ib_umem *umem;
> + unsigned long user_virt;
> + int refcnt;
> +};
> +
> +
> +#endif /* MX_ABI_H */
> diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/provider.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/provider.c
> new file mode 100644
> index 000000000000..7115cf769fe0
> --- /dev/null
> +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/provider.c
> @@ -0,0 +1,2961 @@
> +/*
> + * Copyright (c) 2014, Microsoft Corporation.
> + *
> + * Author:
> + * K. Y. Srinivasan <kys@microsoft.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published
> + * by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> + * NON INFRINGEMENT. See the GNU General Public License for more
> + * details.
> + *
> + * Bug fixes/enhancements: Long Li <longli@microsoft.com>
> + */
> +
> +#include <linux/module.h>
> +#include <linux/moduleparam.h>
> +#include <linux/device.h>
> +#include <linux/netdevice.h>
> +#include <linux/etherdevice.h>
> +#include <linux/delay.h>
> +#include <linux/errno.h>
> +#include <linux/list.h>
> +#include <linux/spinlock.h>
> +#include <linux/ethtool.h>
> +#include <linux/rtnetlink.h>
> +#include <linux/inetdevice.h>
> +#include <linux/io.h>
> +#include <linux/hyperv.h>
> +#include <linux/completion.h>
> +#include <rdma/iw_cm.h>
> +#include <rdma/ib_verbs.h>
> +#include <rdma/ib_smi.h>
> +#include <rdma/ib_umem.h>
> +#include <rdma/ib_user_verbs.h>
> +#include <rdma/rdma_cm.h>
> +#include <linux/proc_fs.h>
> +#include <linux/seq_file.h>
> +#include "vmbus_rdma.h"
> +
> +/*
> + * We are emulating mlx4. XXXKYS: May have to FIX.
> + */
> +#include "./user.h"
> +
> +static struct hvnd_dev *g_nd_dev; // the one and only one
> +
> +int hvnd_log_level = HVND_ERROR;
> +module_param(hvnd_log_level, int, 0644);
> +MODULE_PARM_DESC(hvnd_log_level,
> + "Logging level, 0 - Error (default), 1 - Warning, 2 - Info, 3 - Debug.");
> +
> +static int disable_cq_notify = 1;
> +//static int disable_cq_notify = 0;
> +module_param(disable_cq_notify, int, 0644);
> +MODULE_PARM_DESC(disable_cq_notify,
> + "Disable CQ notification, 0 - Enable, 1 - Disable (default).");
> +
> +enum {
> + MLX4_USER_DEV_CAP_64B_CQE = 1L << 0
> +};
> +
> +#define HVND_NODE_DESC "vmbus-RDMA"
> +
> +#undef MLX4_IB_UVERBS_ABI_VERSION
> +#define MLX4_IB_UVERBS_ABI_VERSION 4
> +
> +struct mlx4_wqe_data_seg {
> + __be32 byte_count;
> + __be32 lkey;
> + __be64 addr;
> +};
> +
> +/* return value:
> + * true: ep is running
> + * false: ep is stopped
> + */
> +bool ep_add_work_pending(struct hvnd_ep_obj *ep_object)
> +{
> + bool ret = true;
> +
> + atomic_inc(&ep_object->nr_requests_pending);
> + if (ep_object->stopping) {
> + if (atomic_dec_and_test(&ep_object->nr_requests_pending))
> + wake_up(&ep_object->wait_pending);
> + ret = false;
> + }
> + return ret;
> +}
> +
> +void ep_del_work_pending(struct hvnd_ep_obj *ep_object)
> +{
> + if (atomic_dec_and_test(&ep_object->nr_requests_pending))
> + wake_up(&ep_object->wait_pending);
> +
> + if (atomic_read(&ep_object->nr_requests_pending) < 0) {
> + hvnd_error("ep_object->nr_requests_pending=%d type=%d cm_state=%d\n",
> + atomic_read(&ep_object->nr_requests_pending),
> + ep_object->type, ep_object->cm_state);
> + dump_stack();
> + }
> +}
> +
> +void ep_stop(struct hvnd_ep_obj *ep_object)
> +{
> + if (!ep_object->stopping) {
> + ep_object->stopping = true;
> + hvnd_cancel_io(ep_object);
> + }
> +
> + if (atomic_read(&ep_object->nr_requests_pending) < 0) {
> + hvnd_error("IO canceled, ep_object->nr_requests_pending=%d type=%d cm_state=%d\n",
> + atomic_read(&ep_object->nr_requests_pending),
> + ep_object->type, ep_object->cm_state);
> + dump_stack();
> + }
> +
> + wait_event(ep_object->wait_pending,
> + !atomic_read(&ep_object->nr_requests_pending));
> +}
> +
> +static int vmbus_dma_map_sg(struct device *dev, struct scatterlist *sgl,
> + int nents, enum dma_data_direction direction,
> + unsigned long attrs)
> +{
> + struct scatterlist *sg;
> + u64 addr;
> + int i;
> + int ret = nents;
> +
> + BUG_ON(!valid_dma_direction(direction));
> +
> + for_each_sg(sgl, sg, nents, i) {
> + addr = (u64) page_address(sg_page(sg));
> + /* TODO: handle highmem pages */
> + if (!addr) {
> + ret = 0;
> + break;
> + }
> + sg->dma_address = addr + sg->offset;
> + sg->dma_length = sg->length;
> + }
> + return ret;
> +}
> +
> +static void vmbus_dma_unmap_sg(struct device *dev,
> + struct scatterlist *sg, int nents,
> + enum dma_data_direction direction, unsigned long attrs)
> +{
> + BUG_ON(!valid_dma_direction(direction));
> +}
> +
> +
> +const struct dma_map_ops vmbus_dma_ops = {
> + .map_sg = vmbus_dma_map_sg,
> + .unmap_sg = vmbus_dma_unmap_sg,
> +};
> +
> +static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener,
> + struct hvnd_dev *nd_dev,
> + struct hvnd_ucontext *uctx);
> +
> +static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type,
> + struct hvnd_dev *nd_dev,
> + struct hvnd_ucontext *uctx);
> +
> +static void hvnd_deinit_ep(struct hvnd_ep_obj *ep)
> +{
> + put_irp_handle(ep->nd_dev, ep->local_irp);
> +}
> +
> +static void hvnd_destroy_ep(struct hvnd_ep_obj *ep)
> +{
> + hvnd_debug("canceling work for ep %p\n", ep);
> + cancel_work_sync(&ep->wrk.work);
> + hvnd_deinit_ep(ep);
> + kfree(ep);
> +}
> +
> +
> +#define UC(b) (((int)b)&0xff)
> +char *debug_inet_ntoa(struct in_addr in, char *b)
> +{
> + register char *p;
> +
> + p = (char *)&in;
> + (void)snprintf(b, 20,
> + "%d.%d.%d.%d", UC(p[0]), UC(p[1]), UC(p[2]), UC(p[3]));
> + return b;
> +}
> +
> +void hvnd_process_events(struct work_struct *work);
> +
> +static int hvnd_init_ep(struct hvnd_ep_obj *ep_object,
> + struct iw_cm_id *cm_id, int ep_type,
> + struct hvnd_dev *nd_dev,
> + struct hvnd_ucontext *uctx)
> +{
> + int ret;
> +
> + ep_object->type = ep_type;
> + ep_object->cm_id = cm_id;
> + ep_object->nd_dev = nd_dev;
> + ep_object->uctx = uctx;
> +
> + ep_object->parent = NULL;
> +
> + ep_object->wrk.callback_arg = ep_object;
> + INIT_WORK(&ep_object->wrk.work, hvnd_process_events);
> + INIT_LIST_HEAD(&ep_object->incoming_pkt_list);
> + spin_lock_init(&ep_object->incoming_pkt_list_lock);
> +
> + ep_object->stopping = false;
> + atomic_set(&ep_object->nr_requests_pending, 0);
> + init_waitqueue_head(&ep_object->wait_pending);
> +
> + ret = get_irp_handle(nd_dev, &ep_object->local_irp, (void *)ep_object);
> +
> + if (ret) {
> + hvnd_error("get_irp_handle() failed: err: %d\n", ret);
> + return ret;
> + }
> + return 0;
> +
> +}
> +
> +static int set_rq_size(struct hvnd_dev *dev, struct ib_qp_cap *cap,
> + struct hvnd_qp *qp)
> +{
> +
> + /* HW requires >= 1 RQ entry with >= 1 gather entry */
> + if (!cap->max_recv_wr || !cap->max_recv_sge)
> + return -EINVAL;
> +
> + qp->rq_wqe_cnt = roundup_pow_of_two(max(1U, cap->max_recv_wr));
> + qp->rq_max_gs = roundup_pow_of_two(max(1U, cap->max_recv_sge));
> + qp->rq_wqe_shift = ilog2(
> + qp->rq_max_gs * sizeof(struct mlx4_wqe_data_seg));
> +
> +
> + return 0;
> +}
> +
> +static int set_user_sq_size(struct hvnd_dev *dev,
> + struct hvnd_qp *qp,
> + struct mlx4_ib_create_qp *ucmd)
> +{
> + qp->sq_wqe_cnt = 1 << ucmd->log_sq_bb_count;
> + qp->sq_wqe_shift = ucmd->log_sq_stride;
> +
> + qp->buf_size = (qp->rq_wqe_cnt << qp->rq_wqe_shift) +
> + (qp->sq_wqe_cnt << qp->sq_wqe_shift);
> +
> + return 0;
> +}
> +
> +static int hvnd_db_map_user(struct hvnd_ucontext *uctx, unsigned long virt,
> + struct ib_umem **db_umem)
> +{
> + struct mlx4_ib_user_db_page *page;
> + int err = 0;
> +
> + mutex_lock(&uctx->db_page_mutex);
> +
> + list_for_each_entry(page, &uctx->db_page_list, list)
> + if (page->user_virt == (virt & PAGE_MASK))
> + goto found;
> +
> + page = kmalloc(sizeof(*page), GFP_KERNEL);
> + if (!page) {
> + err = -ENOMEM;
> + goto out;
> + }
> +
> + page->user_virt = (virt & PAGE_MASK);
> + page->refcnt = 0;
> + page->umem = ib_umem_get(&uctx->ibucontext, virt & PAGE_MASK,
> + PAGE_SIZE, 0, 0);
> + if (IS_ERR(page->umem)) {
> + hvnd_error("ib_umem_get failure\n");
> + err = PTR_ERR(page->umem);
> + kfree(page);
> + goto out;
> + }
> +
> + list_add(&page->list, &uctx->db_page_list);
> +
> +found:
> + ++page->refcnt;
> +out:
> + mutex_unlock(&uctx->db_page_mutex);
> + if (!err)
> + *db_umem = page->umem;
> +
> + return err;
> +}
> +
> +static void hvnd_db_unmap_user(struct hvnd_ucontext *uctx, u64 db_addr)
> +{
> + struct mlx4_ib_user_db_page *page;
> +
> + mutex_lock(&uctx->db_page_mutex);
> + list_for_each_entry(page, &uctx->db_page_list, list)
> + if (page->user_virt == (db_addr & PAGE_MASK))
> + goto found;
> +
> +found:
> + if (!--page->refcnt) {
> + list_del(&page->list);
> + ib_umem_release(page->umem);
> + kfree(page);
> + }
> +
> + mutex_unlock(&uctx->db_page_mutex);
> +}
> +
> +
> +static
void debug_check(const char *func, int line) > +{ > + hvnd_debug("func is: %s; line is %d\n", func, line); > + > + if (in_interrupt()) { > + hvnd_error("In interrupt func is: %s; line is %d\n", > + func, line); > + return; > + } > +} > + > +static struct ib_ah *hvnd_ah_create(struct ib_pd *pd, > + struct ib_ah_attr *ah_attr, > + struct ib_udata *udata) > +{ > + debug_check(__func__, __LINE__); > + return ERR_PTR(-ENOSYS); > +} > + > +static int hvnd_ah_destroy(struct ib_ah *ah) > +{ > + debug_check(__func__, __LINE__); > + return -ENOSYS; > +} > + > +static int hvnd_multicast_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) > +{ > + debug_check(__func__, __LINE__); > + return -ENOSYS; > +} > + > +static int hvnd_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) > +{ > + debug_check(__func__, __LINE__); > + return -ENOSYS; > +} > + > +static int hvnd_process_mad(struct ib_device *ibdev, > + int mad_flags, > + u8 port_num, > + const struct ib_wc *in_wc, > + const struct ib_grh *in_grh, > + const struct ib_mad_hdr *in_mad, > + size_t in_mad_size, > + struct ib_mad_hdr *out_mad, > + size_t *out_mad_size, > + u16 *out_mad_pkey_index) > +{ > + debug_check(__func__, __LINE__); > + return -ENOSYS; > +} > + > +void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx) > +{ > + atomic_inc(&uctx->refcnt); > +} > + > +void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx) > +{ > + if (atomic_dec_and_test(&uctx->refcnt)) { > + hvnd_debug("uctx ref cnt dropped it is %d\n", > + atomic_read(&uctx->refcnt)); > + hvnd_debug("About to close adaptor\n"); > + hvnd_close_adaptor(nd_dev, uctx); > + } else > + hvnd_debug("uctx ref cnt dropped it is %d\n", > + atomic_read(&uctx->refcnt)); > +} > + > + > +static int hvnd_dealloc_ucontext(struct ib_ucontext *context) > +{ > + struct hvnd_dev *nd_dev; > + struct hvnd_ucontext *uctx; > + > + uctx = to_nd_context(context); > + nd_dev = to_nd_dev(context->device); > + > + hvnd_debug("calling %s\n", __func__); > + > + hvnd_drop_uctx_ref(nd_dev, uctx); > + > + return 0; > +} > + > +static struct ib_ucontext *hvnd_alloc_ucontext(struct ib_device *ibdev, > + struct ib_udata *udata) > +{ > + struct hvnd_dev *nd_dev = to_nd_dev(ibdev); > + struct hvnd_ucontext *uctx; > + struct mlx4_ib_alloc_ucontext_resp resp; > + struct ibv_get_context_resp *ctx_output; > + int ret; > + > + if (!nd_dev->ib_active) { > + hvnd_error("ib device is not active, try again\n"); > + return ERR_PTR(-EAGAIN); > + } > + > + uctx = get_uctx(nd_dev, current_pid()); > + if (uctx) { > + // it is already opened, just increase its reference count > + hvnd_acquire_uctx_ref(uctx); > + } else { > + > + /* > + * The Windows host expects the following to be done: > + * 1. Successfully send struct ndv_pkt_hdr_create_1 > + * 2. INIT PROVIDER > + * 3. Open Adapter > + * Before we can complete this call. > + */ > + > + uctx = kzalloc(sizeof(struct hvnd_ucontext), GFP_KERNEL); > + if (!uctx) > + return ERR_PTR(-ENOMEM); > + > + atomic_set(&uctx->refcnt, 1); > + INIT_LIST_HEAD(&uctx->db_page_list); > + mutex_init(&uctx->db_page_mutex); > + > + /* > + * Stash away the context with the calling PID. 
> + */ > + ret = insert_handle(nd_dev, &nd_dev->uctxidr, uctx, > + current_pid()); > + if (ret) { > + hvnd_error("Uctx ID insertion failed; ret is %d\n", > + ret); > + goto err1; > + } > + > + hvnd_debug("Opening adaptor pid is %d\n", current_pid()); > + > + ret = hvnd_open_adaptor(nd_dev, uctx); > + if (ret) { > + hvnd_error("hvnd_open_adaptor failed ret=%d\n", ret); > + goto err1; > + } > + > + } > + > + /* > + * Copy the response out. > + */ > + ctx_output = &uctx->o_adap_pkt.mappings.ctx_output; > + resp.dev_caps = MLX4_USER_DEV_CAP_64B_CQE; > + resp.qp_tab_size = ctx_output->qp_tab_size; > + resp.bf_reg_size = ctx_output->bf_reg_size; > + resp.bf_regs_per_page = ctx_output->bf_regs_per_page; > + resp.cqe_size = ctx_output->cqe_size; > + > + ret = ib_copy_to_udata(udata, &resp, sizeof(resp)); > + if (ret) { > + hvnd_error("ib_copy_to_udata failed ret=%d\n", ret); > + goto err1; > + } > + > + return &uctx->ibucontext; > + > +err1: > + kfree(uctx); > + return ERR_PTR(ret); > +} > + > +static int hvnd_mmap(struct ib_ucontext *context, struct vm_area_struct *vma) > +{ > + struct hvnd_ucontext *uctx = to_nd_context(context); > + > + if (vma->vm_end - vma->vm_start != PAGE_SIZE) { > + hvnd_error("vma not a page size, actual size=%lu\n", > + vma->vm_end - vma->vm_start); > + return -EINVAL; > + } > + > + if (vma->vm_pgoff == 0) { > + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); > + > + if (io_remap_pfn_range(vma, vma->vm_start, > + (uctx->uar_base >> PAGE_SHIFT), > + PAGE_SIZE, vma->vm_page_prot)) { > + hvnd_error("io_remap_pfn_range failure\n"); > + return -EAGAIN; > + } > + } else if (vma->vm_pgoff == 1 && uctx->bf_buf_size != 0) { > + vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); > + > + if (io_remap_pfn_range(vma, vma->vm_start, > + (uctx->uar_base >> PAGE_SHIFT) + 1, > + PAGE_SIZE, vma->vm_page_prot)) { > + hvnd_error("io_remap_pfn_range failure\n"); > + return -EAGAIN; > + } > + } else { > + hvnd_error("check code\n"); > + return -EINVAL; > + } > + > + return 0; > +} > + > +static int hvnd_deallocate_pd(struct ib_pd *pd) > +{ > + > + struct hvnd_ucontext *uctx; > + struct hvnd_dev *nd_dev; > + struct hvnd_ib_pd *hvnd_pd; > + struct ib_ucontext *ibuctx = pd->uobject->context; > + > + hvnd_pd = to_nd_pd(pd); > + nd_dev = to_nd_dev(pd->device); > + uctx = to_nd_context(ibuctx); > + > + hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, > + IOCTL_ND_PD_FREE); > + > + hvnd_drop_uctx_ref(nd_dev, uctx); > + return 0; > +} > + > +static struct ib_pd *hvnd_allocate_pd(struct ib_device *ibdev, > + struct ib_ucontext *context, > + struct ib_udata *udata) > +{ > + struct hvnd_ucontext *uctx; > + struct hvnd_dev *nd_dev; > + int ret; > + struct hvnd_ib_pd *hvnd_pd; > + > + if (!context) { > + hvnd_error("kernel mode context not supported\n"); > + return ERR_PTR(-EINVAL); > + } > + > + hvnd_pd = kzalloc(sizeof(struct hvnd_ib_pd), GFP_KERNEL); > + > + if (!hvnd_pd) > + return ERR_PTR(-ENOMEM); > + > + uctx = to_nd_context(context); > + nd_dev = to_nd_dev(ibdev); > + > + ret = hvnd_create_pd(uctx, nd_dev, hvnd_pd); > + if (ret) { > + hvnd_error("hvnd_create_pd failure ret=%d\n", ret); > + goto error_cr_pd; > + } > + > + if (context) { > + if (ib_copy_to_udata(udata, &hvnd_pd->pdn, sizeof(__u32))) { > + hvnd_error("ib_copy_to_udata failure\n"); > + ret = -EFAULT; > + goto error_fault; > + } > + } > + > + hvnd_acquire_uctx_ref(uctx); > + return &hvnd_pd->ibpd; > + > +error_fault: > + hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, > + IOCTL_ND_PD_FREE); > + > +error_cr_pd: > + 
kfree(hvnd_pd); > + return ERR_PTR(ret); > +} > + > +static int hvnd_query_pkey(struct ib_device *ibdev, u8 port, u16 index, > + u16 *pkey) > +{ > + debug_check(__func__, __LINE__); > + *pkey = 0; > + return 0; > +} > + > +static int hvnd_query_gid(struct ib_device *ibdev, u8 port, int index, > + union ib_gid *gid) > +{ > + int ret; > + struct hvnd_dev *nd_dev = to_nd_dev(ibdev); > + > + debug_check(__func__, __LINE__); > + > + ret = wait_for_completion_timeout(&nd_dev->addr_set, 60*HZ); > + if (!ret) > + return -ETIMEDOUT; > + > + memset(&(gid->raw[0]), 0, sizeof(gid->raw)); > + memcpy(&(gid->raw[0]), nd_dev->mac_addr, 6); > + return 0; > +} > + > +static int hvnd_query_device(struct ib_device *ibdev, > + struct ib_device_attr *props, > + struct ib_udata *udata) > +{ > + struct hvnd_dev *nd_dev = to_nd_dev(ibdev); > + struct adapter_info_v2 *adap_info; > + > + if (!nd_dev->query_pkt_set) { > + hvnd_error("query packet not received yet\n"); > + return 0; > + } > + > + adap_info = &nd_dev->query_pkt.ioctl.ad_info; > + > + memset(props, 0, sizeof(*props)); > + > + /* > + * Copy the relevant properties out. > + */ > + props->fw_ver = 0; > + props->device_cap_flags = 0; > + //props->device_cap_flags |= IB_DEVICE_BAD_PKEY_CNTR; > + //props->device_cap_flags |= IB_DEVICE_BAD_QKEY_CNTR; > + //props->device_cap_flags |= IB_DEVICE_XRC; > + > + props->vendor_id = 0x15b3; > + props->vendor_part_id = adap_info->device_id; > + > + props->max_mr_size = ~0ull; > + props->page_size_cap = PAGE_SIZE; > + props->max_qp = 16384; > + props->max_qp_wr = min(adap_info->max_recv_q_depth, > + adap_info->max_initiator_q_depth); > + > + props->max_sge = min(adap_info->max_initiator_sge, > + adap_info->max_recv_sge); > + props->max_cq = 0x1FFFF; > + props->max_cqe = adap_info->max_completion_q_depth; > + props->max_mr = 16384; > + props->max_pd = 16384; > + > + props->max_qp_rd_atom = adap_info->max_inbound_read_limit; > + props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; > + props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; > + props->max_srq = 16384; > + props->max_srq_wr = adap_info->max_recv_q_depth; > + props->max_srq_sge = adap_info->max_recv_sge; > + > + return 0; > +} > + > +static int hvnd_query_port(struct ib_device *ibdev, u8 port, > + struct ib_port_attr *props) > +{ > + memset(props, 0, sizeof(struct ib_port_attr)); > + > + props->max_mtu = IB_MTU_4096; > + props->active_mtu = IB_MTU_4096; > + > + /* > + * KYS: TBD need to base this on netdev. 
> + */ > + props->state = IB_PORT_ACTIVE; > + > + props->port_cap_flags = IB_PORT_CM_SUP; > + > + props->gid_tbl_len = 1; > + props->pkey_tbl_len = 1; > + props->active_width = 1; > + props->active_speed = IB_SPEED_DDR; //KYS: check > + props->max_msg_sz = -1; > + > + return 0; > +} > + > +static enum rdma_link_layer > +hvnd_get_link_layer(struct ib_device *device, u8 port) > +{ > + return IB_LINK_LAYER_ETHERNET; > +} > + > +static ssize_t hvnd_show_rev(struct device *dev, struct device_attribute *attr, > + char *buf) > +{ > + return 0; > +} > + > +static ssize_t hvnd_show_fw_ver(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + return 0; > +} > + > +static ssize_t hvnd_show_hca(struct device *dev, struct device_attribute *attr, > + char *buf) > +{ > + return 0; > +} > + > +static ssize_t hvnd_show_board(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + return 0; > +} > + > +static int hvnd_get_port_immutable(struct ib_device *ibdev, u8 port_num, > + struct ib_port_immutable *immutable) > +{ > + struct ib_port_attr attr; > + int err; > + > + err = hvnd_query_port(ibdev, port_num, &attr); > + if (err) > + return err; > + > + immutable->pkey_tbl_len = attr.pkey_tbl_len; > + immutable->gid_tbl_len = attr.gid_tbl_len; > + immutable->core_cap_flags = RDMA_CORE_PORT_IWARP; > + > + return 0; > +} > + > +static struct ib_qp *hvnd_ib_create_qp(struct ib_pd *pd, > + struct ib_qp_init_attr *attrs, > + struct ib_udata *udata) > +{ > + struct hvnd_ucontext *uctx; > + struct hvnd_dev *nd_dev; > + struct mlx4_ib_create_qp ucmd; > + struct hvnd_qp *qp; > + int ret = 0; > + struct hvnd_ib_pd *hvnd_pd = to_nd_pd(pd); > + struct hvnd_cq *send_cq = to_nd_cq(attrs->send_cq); > + struct hvnd_cq *recv_cq = to_nd_cq(attrs->recv_cq); > + > + uctx = get_uctx_from_pd(pd); > + nd_dev = to_nd_dev(pd->device); > + > + if (attrs->qp_type != IB_QPT_RC) { > + hvnd_error("attrs->qp_type=%d not IB_QPT_RC\n", attrs->qp_type); > + return ERR_PTR(-EINVAL); > + } > + > + qp = kzalloc(sizeof(*qp), GFP_KERNEL); > + if (!qp) { > + ret = -ENOMEM; > + goto err_done; > + } > + > + qp->uctx = uctx; > + > + if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { > + hvnd_error("ib_copy_from_udata failed\n"); > + ret = -EFAULT; > + goto err_ucpy; > + } > + > + qp->qp_buf = (void *)ucmd.buf_addr; > + qp->db_addr = (void *)ucmd.db_addr; > + qp->log_sq_bb_count = ucmd.log_sq_bb_count; > + qp->log_sq_stride = ucmd.log_sq_stride; > + qp->sq_no_prefetch = ucmd.sq_no_prefetch; > + qp->port = attrs->port_num; > + > + init_waitqueue_head(&qp->wait); > + atomic_set(&qp->refcnt, 1); > + > + qp->recv_cq = recv_cq; > + qp->send_cq = send_cq; > + qp->nd_dev = nd_dev; > + > + qp->receive_cq_handle = recv_cq->cq_handle; > + qp->initiator_cq_handle = send_cq->cq_handle; > + qp->pd_handle = hvnd_pd->handle; > + qp->cq_notify = false; > + > + qp->ibqp.qp_num = attrs->qp_type == IB_QPT_SMI ? 
0 : 1; > + > + qp->max_inline_data = attrs->cap.max_inline_data; > + > + qp->initiator_q_depth = attrs->cap.max_send_wr; > + qp->initiator_request_sge = attrs->cap.max_send_sge; > + > + > + qp->receive_q_depth = attrs->cap.max_recv_wr; > + qp->receive_request_sge = attrs->cap.max_recv_sge; > + > + set_rq_size(nd_dev, &attrs->cap, qp); > + > + set_user_sq_size(nd_dev, qp, &ucmd); > + > + qp->umem = ib_umem_get(&uctx->ibucontext, ucmd.buf_addr, > + qp->buf_size, 0, 0); > + if (IS_ERR(qp->umem)) { > + ret = PTR_ERR(qp->umem); > + hvnd_error("ib_umem_get failed ret=%d\n", ret); > + goto err_ucpy; > + } > + > + ret = hvnd_db_map_user(uctx, ucmd.db_addr, &qp->db_umem); > + > + if (ret) { > + hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); > + goto err_db_map; > + } > + > + ret = hvnd_create_qp(nd_dev, uctx, qp); > + > + if (ret) { > + hvnd_error("hvnd_create_qp failed ret=%d\n", ret); > + goto err_qp; > + } > + > + hvnd_acquire_uctx_ref(uctx); > + > + qp->ibqp.qp_num = qp->qpn; > + qp->ibqp.qp_type = IB_QPT_RC; > + > + > + return &qp->ibqp; > + > +err_qp: > + hvnd_db_unmap_user(uctx, ucmd.db_addr); > + > +err_db_map: > + ib_umem_release(qp->umem); > + > +err_ucpy: > + kfree(qp); > +err_done: > + return ERR_PTR(ret); > +} > + > +static int hvnd_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, > + int attr_mask, struct ib_udata *udata) > +{ > + struct hvnd_qp *qp = to_nd_qp(ibqp); > + struct hvnd_dev *nd_dev = to_nd_dev(ibqp->device); > + enum ib_qp_state cur_state, new_state; > + int ret = 0; > + > + if (attr != NULL) { > + cur_state = (attr_mask & IB_QP_CUR_STATE) > + ? attr->cur_qp_state : qp->qp_state; > + new_state = (attr_mask & IB_QP_STATE) > + ? attr->qp_state : cur_state; > + > + hvnd_debug("qp->qp_state is %d new state is %d\n", > + qp->qp_state, new_state); > + hvnd_debug("current qp state is %d\n", cur_state); > + if (attr_mask & IB_QP_STATE) { > + /* Ensure the state is valid */ > + if (attr->qp_state < 0 || attr->qp_state > IB_QPS_ERR) { > + hvnd_error("incorrect qp state attr->qp_state=%d\n", > + attr->qp_state); > + return -EINVAL; > + } > + > + if (qp->qp_state != new_state) { > + qp->qp_state = new_state; > + /* > + * The only state transition supported is the > + * transition to error state. 
> + */ > + switch (new_state) { > + case IB_QPS_ERR: > + case IB_QPS_SQD: > + ret = hvnd_flush_qp(nd_dev, qp->uctx, > + qp); > + > + if (ret) > + hvnd_error("hvnd_flush_qp failed ret=%d\n", ret); > + > + // immediately notify the upper layer > + // on disconnection > + if (!ret && qp->connector) > + hvnd_process_notify_disconnect( > + qp->connector, > + STATUS_SUCCESS); > + > + return ret; > + > + default: > + break; > + } > + } > + } > + } > + return 0; > +} > + > + > +static int hvnd_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, > + int attr_mask, struct ib_qp_init_attr *init_attr) > +{ > + struct hvnd_qp *qp = to_nd_qp(ibqp); > + > + memset(attr, 0, sizeof(*attr)); > + memset(init_attr, 0, sizeof(*init_attr)); > + > + attr->qp_state = qp->qp_state; > + > + init_attr->cap.max_send_wr = qp->max_send_wr; > + init_attr->cap.max_recv_wr = qp->max_recv_wr; > + > + init_attr->cap.max_send_sge = qp->max_send_sge; > + init_attr->cap.max_recv_sge = qp->max_recv_sge; > + init_attr->cap.max_inline_data = qp->max_inline_data; > + > + init_attr->sq_sig_type = IB_SIGNAL_ALL_WR; > + > + return 0; > +} > + > +static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status); > +static int hvnd_destroy_qp(struct ib_qp *ib_qp) > +{ > + int ret; > + struct hvnd_qp *qp = to_nd_qp(ib_qp); > + struct hvnd_dev *nd_dev = to_nd_dev(ib_qp->device); > + u64 start_jiffies; > + > + if (!qp->connector) { > + hvnd_warn("error: connector is NULL; skip destroying connector\n"); > + goto free_qp; > + } > + > + /* > + * Should we flush the qp first on ctrl-C? Is there > + * no need to disconnect on abrupt shutdown? > + */ > + if (qp->qp_state != IB_QPS_ERR && qp->qp_state != IB_QPS_SQD) { > + hvnd_warn("qp_state=%d, doing abrupt disconnect\n", > + qp->qp_state); > + hvnd_flush_qp(nd_dev, qp->uctx, qp); > + > + ep_stop(qp->connector); > + > + // now no pending activity is possible on the connector > + > + switch (qp->connector->cm_state) { > + > + case hvnd_cm_idle: > + case hvnd_cm_connect_reply_refused: > + case hvnd_cm_connect_request_sent: > + case hvnd_cm_close_sent: > + hvnd_warn("cm_state = %d not doing anything\n", > + qp->connector->cm_state); > + break; > + > + case hvnd_cm_connect_received: > + hvnd_warn("cm_state = %d refusing pending connection request\n", > + qp->connector->cm_state); > + hvnd_refuse_connection(qp->connector, -ECONNREFUSED); > + break; > + > + case hvnd_cm_connect_reply_sent: > + case hvnd_cm_established_sent: > + case hvnd_cm_accept_sent: > + hvnd_warn("cm_state = %d notifying disconnect on existing connection\n", > + qp->connector->cm_state); > + hvnd_process_notify_disconnect(qp->connector, > + STATUS_CANCELLED); > + break; > + > + default: > + hvnd_error("unknown cm_state = %d\n", > + qp->connector->cm_state); > + > + } > + goto free_connector; > + } else { > + hvnd_debug("qp_state=%d, doing normal disconnect\n", > + qp->qp_state); > + } > + > + if (!ep_add_work_pending(qp->connector)) > + goto free_connector; > + > + init_completion(&qp->connector->disconnect_event); > + > + /* > + * First issue a disconnect on the connector. > + */ > + > + hvnd_debug("calling hvnd_connector_disconnect\n"); > + ret = hvnd_connector_disconnect(nd_dev, qp->uctx, > + qp->connector->ep_handle, > + qp->connector); > + if (ret) { > + ep_del_work_pending(qp->connector); > + hvnd_error("disconnect: retval is %d\n", ret); > + ep_stop(qp->connector); > + goto free_connector; > + } > + /* > + * Now wait for the disconnect.
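Since the only transitions honored by hvnd_ib_modify_qp() are the moves to the error/SQD states, the userspace flush that precedes teardown reduces to one call; a minimal sketch:

    struct ibv_qp_attr attr = {
        .qp_state = IBV_QPS_ERR,  /* ends up in hvnd_flush_qp() above */
    };
    ibv_modify_qp(qp, &attr, IBV_QP_STATE);

Any other attribute change is accepted and silently ignored.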
> + */ > + start_jiffies = get_jiffies_64(); > + wait_for_completion(&qp->connector->disconnect_event); > + hvnd_debug("Completed disconnect connector=%p jiffies=%llu\n", > + qp->connector, get_jiffies_64() - start_jiffies); > + > + /* > + * Now free up the connector and drop the reference on uctx. > + */ > + > + ep_stop(qp->connector); > + > +free_connector: > + hvnd_debug("destroying connector handle: %p\n", > + (void *) qp->connector->ep_handle); > + hvnd_free_handle(nd_dev, qp->uctx, > + qp->connector->ep_handle, > + IOCTL_ND_CONNECTOR_FREE); > + > + hvnd_drop_uctx_ref(nd_dev, qp->uctx); > + hvnd_destroy_ep(qp->connector); > + qp->connector = NULL; > +free_qp: > + atomic_dec(&qp->refcnt); > + hvnd_debug("Waiting for the ref cnt to go to 0\n"); > + > + wait_event(qp->wait, !atomic_read(&qp->refcnt)); > + > + hvnd_debug("About to destroy qp\n"); > + hvnd_db_unmap_user(qp->uctx, (u64)qp->db_addr); > + ib_umem_release(qp->umem); > + > + hvnd_debug("About to free qp\n"); > + ret = hvnd_free_qp(nd_dev, qp->uctx, qp); > + > + if (ret == 0) { > + hvnd_drop_uctx_ref(nd_dev, qp->uctx); > + kfree(qp); > + } else { > + hvnd_error("free qp failed: ret is %d\n", ret); > + } > + > + return ret; > +} > + > +static struct ib_cq *hvnd_ib_create_cq(struct ib_device *ibdev, > + const struct ib_cq_init_attr *attr, > + struct ib_ucontext *ib_context, > + struct ib_udata *udata) > +{ > + struct hvnd_ucontext *uctx; > + struct hvnd_dev *nd_dev; > + struct mlx4_ib_create_cq ucmd; > + struct hvnd_cq *cq; > + int ret = 0; > + int entries = attr->cqe; > + > + uctx = to_nd_context(ib_context); > + nd_dev = to_nd_dev(ibdev); > + > + if (entries < 1 || entries > uctx->max_cqe) { > + hvnd_error("incorrect entries=%d\n", entries); > + ret = -EINVAL; > + goto err_done; > + } > + > + cq = kzalloc(sizeof(*cq), GFP_KERNEL); > + if (!cq) { > + ret = -ENOMEM; > + goto err_done; > + } > + > + entries = roundup_pow_of_two(entries + 1); > + cq->ibcq.cqe = entries - 1; > + cq->entries = entries; > + cq->uctx = uctx; > + > + if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { > + hvnd_error("ib_copy_from_udata failed\n"); > + ret = -EFAULT; > + goto err_ucpy; > + } > + > + cq->cq_buf = (void *)ucmd.buf_addr; > + cq->db_addr = (void *)ucmd.db_addr; > + cq->arm_sn = 0; > + > + /* > + * Initialize the IRP state. Need to have a separate irp state > + * for CQ; for now share it with Listener/connector.
> + */ > + ret = hvnd_init_ep(&cq->ep_object, NULL, ND_CQ, nd_dev, uctx); > + > + if (ret) { > + hvnd_error("hvnd_init_ep failed ret=%d\n", ret); > + goto err_ucpy; > + } > + > + cq->ep_object.cq = cq; > + cq->monitor = true; > + > + cq->umem = ib_umem_get(ib_context, ucmd.buf_addr, > + (entries * uctx->cqe_size), > + IB_ACCESS_LOCAL_WRITE, 1); > + if (IS_ERR(cq->umem)) { > + ret = PTR_ERR(cq->umem); > + hvnd_error("ib_umem_get failed ret=%d\n", ret); > + goto err_ucpy; > + } > + > + ret = hvnd_db_map_user(uctx, ucmd.db_addr, &cq->db_umem); > + > + if (ret) { > + hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); > + goto err_db_map; > + } > + > + ret = hvnd_create_cq(nd_dev, uctx, cq); > + > + if (ret) { > + hvnd_error("hvnd_create_cq failed ret=%d\n", ret); > + goto err_cq; > + } > + > + cq->ep_object.ep_handle = cq->cq_handle; > + > + if (ib_copy_to_udata(udata, &cq->cqn, sizeof(__u32))) { > + hvnd_error("ib_copy_to_udata failed\n"); > + ret = -EFAULT; > + goto err_ucpy_out; > + } > + > + if (!disable_cq_notify) { > + > + if (!ep_add_work_pending(&cq->ep_object)) > + goto err_ucpy_out; > + > + ret = hvnd_notify_cq(nd_dev, cq, ND_CQ_NOTIFY_ANY, > + (u64)&cq->ep_object); > + > + if (ret) { > + ep_del_work_pending(&cq->ep_object); > + hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); > + goto err_ucpy_out; > + } > + } > + > + hvnd_acquire_uctx_ref(uctx); > + > + return &cq->ibcq; > + > +err_ucpy_out: > + hvnd_destroy_cq(nd_dev, cq); > + > +err_cq: > + hvnd_db_unmap_user(uctx, ucmd.db_addr); > + > +err_db_map: > + ib_umem_release(cq->umem); > + > +err_ucpy: > + kfree(cq); > +err_done: > + return ERR_PTR(ret); > +} > + > +static struct ib_qp *hvnd_get_qp(struct ib_device *dev, int qpn) > +{ > + struct hvnd_dev *nd_dev; > + struct hvnd_qp *qp = NULL; > + > + nd_dev = to_nd_dev(dev); > + qp = get_qpp(nd_dev, qpn); > + return qp ? &qp->ibqp : NULL; > +} > + > +static int hvnd_ib_destroy_cq(struct ib_cq *ib_cq) > +{ > + struct hvnd_ucontext *uctx; > + struct hvnd_dev *nd_dev; > + struct hvnd_cq *cq; > + > + cq = to_nd_cq(ib_cq); > + uctx = cq->uctx; > + nd_dev = to_nd_dev(uctx->ibucontext.device); > + > + cq->monitor = false; > + > +// hvnd_cancel_io(&cq->ep_object); > + ep_stop(&cq->ep_object); > + > + hvnd_deinit_ep(&cq->ep_object); > + > + hvnd_db_unmap_user(uctx, (u64)cq->db_addr); > + ib_umem_release(cq->umem); > + > + hvnd_destroy_cq(nd_dev, cq); > + > + hvnd_drop_uctx_ref(nd_dev, uctx); > + kfree(cq); > + > + return 0; > +} > + > +static int hvnd_resize_cq(struct ib_cq *cq, int cqe, struct ib_udata *udata) > +{ > + /* > + * NDDirect does not support resizing CQ.
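With resize unsupported, a consumer that outgrows its CQ has to destroy and re-create it once no QP references it any more; sketched with libibverbs (new_depth/chan are illustrative):

    if (ibv_resize_cq(cq, new_depth))  /* always fails against this driver */
        fprintf(stderr, "resize not supported\n");
    ibv_destroy_cq(cq);
    cq = ibv_create_cq(ctx, new_depth, NULL, chan, 0);

-ENOSYS is the historical choice here; current kernels generally prefer -EOPNOTSUPP for unimplemented verbs.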
> + */ > + hvnd_info("check code\n"); > + return -ENOSYS; > +} > + > +static int hvnd_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc) > +{ > + hvnd_info("check code\n"); > + return 0; > +} > + > +static struct ib_mr *hvnd_get_dma_mr(struct ib_pd *pd, int acc) > +{ > + hvnd_info("check code\n"); > + return NULL; > +} > + > +static void debug_dump_umem(struct ib_umem *umem) > +{ > +#ifdef HVND_MEM_DEBUG > + struct ib_umem_chunk *chunk; > + struct scatterlist *sg; > + int len, j, entry; > + int shift = ffs(umem->page_size) - 1; > + > + hvnd_debug("umem=%p\n", umem); > + hvnd_debug("context=%p length=%lu offset=%d page_size=%d writable=%d hugetlb=%d\n", > + umem->context, > + umem->length, > + umem->offset, > + umem->page_size, > + umem->writable, > + umem->hugetlb); > + > + list_for_each_entry(chunk, &umem->chunk_list, list) { > + hvnd_debug("chunk->nmap=%d\n", chunk->nmap); > + for (j = 0; j < chunk->nmap; ++j) { > + sg = &chunk->page_list[j]; > + hvnd_debug("sg_dma_len=%d sg_dma_address=%llx\n", > + sg_dma_len(sg), sg_dma_address(sg)); > + hvnd_debug("page_link=%lx offset=%u length=%u\n", > + sg->page_link, sg->offset, sg->length); > + len = sg_dma_len(&chunk->page_list[j]) >> shift; > + for_each_sg(&chunk->page_list[j], sg, len, entry) { > + hvnd_debug("PFN=%lu\n", > + page_to_pfn(sg_page(sg))); > + } > + } > + } > +#endif > +} > + > + > +static struct ib_mr *hvnd_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, > + u64 virt, int acc, struct ib_udata *udata) > +{ > + int err = 0; > + struct hvnd_ib_pd *hvndpd = to_nd_pd(pd); > + struct hvnd_mr *mr; > + > + mr = kmalloc(sizeof(*mr), GFP_KERNEL); > + if (!mr) > + return ERR_PTR(-ENOMEM); > + mr->pd = hvndpd; > + > + mr->umem = ib_umem_get(pd->uobject->context, start, length, acc, 0); > + if (IS_ERR(mr->umem)) { > + err = PTR_ERR(mr->umem); > + hvnd_error("ib_umem_get failed ret=%d\n", err); > + kfree(mr); > + return ERR_PTR(err); > + } > + > + debug_dump_umem(mr->umem); > + > + mr->start = start; > + mr->length = length; > + mr->virt = virt; > + mr->acc = acc; > + > + hvnd_debug("start=%llx length=%llx virt=%llx acc=%d\n", > + start, length, virt, acc); > + > + /* > + * First create a memory region. > + */ > + err = hvnd_cr_mr(to_nd_dev(pd->device), > + to_nd_context(pd->uobject->context), hvndpd->handle, > + &mr->mr_handle); > + if (err) { > + hvnd_error("cr_mr failed; ret is %d\n", err); > + goto err; > + } > + > + err = hvnd_mr_register(to_nd_dev(pd->device), > + to_nd_context(pd->uobject->context), mr); > + > + if (err) > + goto err0; > + > + hvnd_acquire_uctx_ref(to_nd_context(pd->uobject->context)); > + > + return &mr->ibmr; > + > +err0: > + hvnd_free_mr(to_nd_dev(pd->device), > + to_nd_context(pd->uobject->context), mr->mr_handle); > +err: > + ib_umem_release(mr->umem); > + kfree(mr); > + return ERR_PTR(err); > +} > + > + > + > +static int hvnd_dereg_mr(struct ib_mr *ib_mr) > +{ > + int ret; > + struct hvnd_mr *mr = to_nd_mr(ib_mr); > + struct hvnd_ucontext *uctx = to_nd_context(ib_mr->pd->uobject->context); > + struct hvnd_dev *nd_dev = to_nd_dev(ib_mr->device); > + > + > + hvnd_debug("dereg_mr entering\n"); > + > + ret = hvnd_deregister_mr(nd_dev, uctx, mr->mr_handle); > + > + if (ret) { > + hvnd_error("hvnd_deregister_mr() failed: %x\n", ret); > + return ret; > + } > + /* > + * Now free up the memory region. 
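The two host calls in hvnd_reg_user_mr() above (hvnd_cr_mr() followed by hvnd_mr_register()) hide behind the single verb userspace sees; roughly:

    void *buf = malloc(len);
    struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
                                   IBV_ACCESS_LOCAL_WRITE |
                                   IBV_ACCESS_REMOTE_READ |
                                   IBV_ACCESS_REMOTE_WRITE);

The access flags arrive as 'acc', and the umem pinned by ib_umem_get() provides the page list that registration forwards to the host.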
> + */ > + > + ret = hvnd_free_mr(nd_dev, uctx, mr->mr_handle); > + if (ret) { > + hvnd_error("hvnd_free_mr() failed: %x\n", ret); > + return ret; > + } > + > + ib_umem_release(mr->umem); > + > + hvnd_drop_uctx_ref(nd_dev, uctx); > + kfree(mr); > + > + hvnd_debug("dereg_mr done\n"); > + return 0; > +} > + > +static struct ib_mw *hvnd_alloc_mw(struct ib_pd *pd, enum ib_mw_type type, > + struct ib_udata *udata) > +{ > + hvnd_info("check code\n"); > + return NULL; > +} > + > +static int hvnd_dealloc_mw(struct ib_mw *mw) > +{ > + debug_check(__func__, __LINE__); > + return 0; > +} > + > +static int hvnd_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags) > +{ > + struct hvnd_ucontext *uctx; > + struct hvnd_dev *nd_dev; > + struct hvnd_cq *cq; > + > + cq = to_nd_cq(ibcq); > + uctx = cq->uctx; > + nd_dev = to_nd_dev(uctx->ibucontext.device); > + > + > + debug_check(__func__, __LINE__); > + > + return 0; > +} > + > +static int hvnd_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, > + struct ib_send_wr **bad_wr) > +{ > + debug_check(__func__, __LINE__); > + return 0; > +} > + > +int hvnd_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr, > + struct ib_recv_wr **bad_wr) > +{ > + debug_check(__func__, __LINE__); > + return 0; > +} > + > +static int hvnd_resolve_addr(struct sockaddr_in *laddr, > + struct sockaddr_in *raddr, > + struct if_physical_addr *phys_addrstruct) > +{ > + int ret; > + > + phys_addrstruct->length = ETH_ALEN; > + ret = hvnd_get_neigh_mac_addr((struct sockaddr *)laddr, > + (struct sockaddr *)raddr, > + phys_addrstruct->addr); > + > + hvnd_debug("Dest MAC is %pM\n", phys_addrstruct->addr); > + return ret; > +} > + > +static int hvnd_connect(struct iw_cm_id *cm_id, > + struct iw_cm_conn_param *conn_param) > +{ > + int ret = 0; > + struct hvnd_dev *nd_dev; > + struct hvnd_ep_obj *ep_object; > + struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_id->remote_addr; > + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; > + struct hvnd_qp *qp; > + struct if_physical_addr phys_addrstruct; > + union nd_sockaddr_inet dest_addr; > + u64 connector_handle; > + union nd_sockaddr_inet addr; > + char addr_buf[50]; > + > + if (cm_id->remote_addr.ss_family != AF_INET) { > + hvnd_error("cm_id->remote_addr.ss_family=%d not AF_INET\n", > + cm_id->remote_addr.ss_family); > + return -ENOSYS; > + } > + > + qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); > + > + if (!qp) { > + hvnd_error("failed to find qp conn_param->qpn=%d\n", > + conn_param->qpn); > + return -EINVAL; > + } > + > + cm_id->provider_data = qp; > + cm_id->add_ref(cm_id); > + qp->cm_id = cm_id; > + > + /* > + * Set the read/write limits. > + * Can we change the limits on a created QP? Luke? > + */ > + nd_dev = to_nd_dev(cm_id->device); > + ep_object = hvnd_setup_ep(cm_id, ND_CONNECTOR, nd_dev, qp->uctx); > + if (!ep_object) { > + hvnd_error("hvnd_setup_ep failure\n"); > + ret = -ENOMEM; > + goto err_limit; > + } > + hvnd_debug("active connection: local irp is %d\n", > + ep_object->local_irp); > + > + ret = hvnd_cr_connector(nd_dev, qp->uctx, > + &connector_handle); > + > + if (ret) { > + hvnd_error("hvnd_cr_connector failure ret=%d\n", ret); > + goto err_cr_connector; > + } > + > + hvnd_acquire_uctx_ref(qp->uctx); > + ep_object->ep_handle = connector_handle; > + ep_object->incoming = false; > + qp->connector = ep_object; > + > + /* > + * Bind the local address to the connector.
> + */ > + hvnd_debug("Connect local address is %s\n", > + debug_inet_ntoa(laddr->sin_addr, addr_buf)); > + > + memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); > + hvnd_debug("CONNECT AF %d port %d addr %s\n", > + addr.ipv4.sin_family, addr.ipv4.sin_port, > + debug_inet_ntoa(addr.ipv4.sin_addr, addr_buf)); > + > + ret = hvnd_bind_connector(nd_dev, qp->uctx, > + connector_handle, > + &addr); > + > + if (ret) { > + hvnd_error("hvnd_bind_connector failed ret=%d\n", ret); > + goto err_bind_connector; > + } > + > + ret = hvnd_resolve_addr(laddr, raddr, &phys_addrstruct); > + if (ret) { > + hvnd_error("hvnd_resolve_addr failed ret=%d\n", ret); > + goto err_bind_connector; > + } > + > + memcpy(&dest_addr.ipv4, raddr, sizeof(struct sockaddr_in)); > + > + > + > + /* > + * Now attempt to connect. > + */ > + > + hvnd_debug("About to initiate connection\n"); > + > + if (!ep_add_work_pending(ep_object)) > + goto err_bind_connector; > + > + ep_object->cm_state = hvnd_cm_connect_received; > + ret = hvnd_connector_connect(nd_dev, qp->uctx, > + ep_object->ep_handle, > + conn_param->ird, conn_param->ord, > + conn_param->private_data_len, > + (u8 *)conn_param->private_data, > + qp->qp_handle, > + &phys_addrstruct, &dest_addr, > + ep_object); > + > + if (ret == 0) > + return 0; > + > + ep_object->cm_state = hvnd_cm_idle; > + ep_del_work_pending(ep_object); > + hvnd_error("hvnd_connector_connect failed ret=%d\n", ret); > + > +err_bind_connector: > + qp->connector = NULL; > + hvnd_free_connector(nd_dev, qp->uctx, > + connector_handle); > + hvnd_drop_uctx_ref(nd_dev, qp->uctx); > + > +err_cr_connector: > + kfree(ep_object); > + > +err_limit: > + cm_id->provider_data = NULL; > + qp->cm_id = NULL; > + cm_id->rem_ref(cm_id); > + return ret; > +} > + > +static int hvnd_accept_cr(struct iw_cm_id *cm_id, > + struct iw_cm_conn_param *conn_param) > +{ > + int ret = 0; > + struct hvnd_dev *nd_dev; > + struct hvnd_qp *qp; > + struct hvnd_ep_obj *connector; > + enum ibv_qp_state new_qp_state; > + > + hvnd_debug("Accepting connection - PASSIVE\n"); > + nd_dev = to_nd_dev(cm_id->device); > + qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); > + > + if (!qp) { > + hvnd_error("get_qpp failed conn_param->qpn=%d\n", > + conn_param->qpn); > + return -EINVAL; > + } > + > + > + connector = (struct hvnd_ep_obj *)cm_id->provider_data; > + qp->connector = connector; > + > + if (connector == NULL) { > + hvnd_error("NULL connector!\n"); > + return -EINVAL; > + } > + hvnd_debug("connector's cm_id is %p caller cm_id=%p\n", > + connector->cm_id, cm_id); > + > + connector->cq = qp->recv_cq; > + > + > + /* > + * Setup state for the accepted connection. 
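The whole active path above is normally driven through librdmacm; a minimal client sketch (event waits and error handling elided, names illustrative):

    struct rdma_cm_id *id;
    struct rdma_conn_param param = {
        .initiator_depth = 1, .responder_resources = 1,
    };

    rdma_create_id(ev_ch, &id, NULL, RDMA_PS_TCP);
    rdma_resolve_addr(id, NULL, (struct sockaddr *)&dst, 2000);
    /* wait for RDMA_CM_EVENT_ADDR_RESOLVED */
    rdma_resolve_route(id, 2000);
    /* wait for RDMA_CM_EVENT_ROUTE_RESOLVED */
    rdma_connect(id, &param);  /* enters hvnd_connect() via iw_cm */

conn_param->ird/ord above correspond to responder_resources/initiator_depth here.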
> + */ > + cm_id->add_ref(cm_id); > + connector->cm_id = cm_id; > + if (conn_param == NULL) { > + hvnd_error("NULL conn_param!\n"); > + return -EINVAL; > + } > + > + connector->ord = conn_param->ord; > + connector->ird = conn_param->ird; > + > + if (!ep_add_work_pending(connector)) > + goto error; > + > + init_completion(&connector->connector_accept_event); > + > + ret = hvnd_connector_accept(nd_dev, qp->uctx, connector->ep_handle, > + qp->qp_handle, conn_param->ird, > + conn_param->ord, > + conn_param->private_data_len, > + conn_param->private_data, > + &new_qp_state, connector); > + > + if (ret) { > + ep_del_work_pending(connector); > + hvnd_error("connector accept failed\n"); > + goto error; > + } > + > + wait_for_completion(&connector->connector_accept_event); > + ret = connector->connector_accept_status; > + > + if (ret) { > + hvnd_error("connector_accept failed status=%x\n", ret); > + ret = -EIO; > + goto error; > + } > + > + hvnd_debug("Passive Connection Accepted; new qp state is %d\n", > + new_qp_state); > + connector->cm_state = hvnd_cm_accept_sent; > + return 0; > + > +error: > + ep_stop(connector); > + connector->cm_id = NULL; > + connector->cm_state = hvnd_cm_idle; > + > + qp->connector = NULL; > + cm_id->rem_ref(cm_id); > + > + return ret; > +} > + > +static int hvnd_reject_cr(struct iw_cm_id *cm_id, const void *pdata, > + u8 pdata_len) > +{ > + debug_check(__func__, __LINE__); > + return 0; > +} > + > +void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status) > +{ > + struct iw_cm_event cm_event; > + > + switch (status) { > + case STATUS_SUCCESS: > + case STATUS_CANCELLED: > + break; > + > + default: > + hvnd_warn("disconnect complete failed: status:%d\n", status); > + } > + > + > + hvnd_debug("active disconnect processed\n"); > + memset(&cm_event, 0, sizeof(cm_event)); > + > + complete(&ep_object->disconnect_event); > +} > + > + > +void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status) > +{ > + struct iw_cm_event cm_event; > + > + // make sure we only disconnect once > + if (atomic_xchg(&ep_object->disconnect_notified, 1)) > + return; > + > + /* > + * Turn off CQ monitoring. > + */ > + if (ep_object->cq) > + ep_object->cq->monitor = false; > + > + switch (ep_object->cm_state) { > + case hvnd_cm_connect_reply_sent: > + case hvnd_cm_established_sent: > + case hvnd_cm_accept_sent: > + break; > + > + default: > + hvnd_error("unexpected cm_state=%d\n", ep_object->cm_state); > + return; > + } > + > + switch (status) { > + case STATUS_SUCCESS: > + case STATUS_CANCELLED: > + case STATUS_DISCONNECTED: > + break; > + > + default: > + hvnd_warn("notify disconnect complete failed: status:%d\n", > + status); > + } > + > + hvnd_debug("passive disconnect notified\n"); > + memset(&cm_event, 0, sizeof(cm_event)); > + > + /* > + * Other end disconnected. > + * Connection has been disconnected; > + * notify the cm layer. 
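Each iw_cm event assembled in these handlers reaches an rdma_cm application through its event channel; the consuming loop is the standard one (sketch):

    struct rdma_cm_event *ev;

    while (rdma_get_cm_event(ev_ch, &ev) == 0) {
        switch (ev->event) {
        case RDMA_CM_EVENT_ESTABLISHED:   /* from IW_CM_EVENT_ESTABLISHED */
            break;
        case RDMA_CM_EVENT_DISCONNECTED:  /* from the IW_CM_EVENT_CLOSE built just below */
            break;
        default:
            break;
        }
        rdma_ack_cm_event(ev);
    }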
> + */ > + cm_event.status = -ECONNRESET; > + cm_event.event = IW_CM_EVENT_CLOSE; > + > + if ((ep_object->cm_id) && > + (ep_object->cm_id->event_handler)) { > + > + ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); > + > + ep_object->cm_id->rem_ref(ep_object->cm_id); > + ep_object->cm_state = hvnd_cm_close_sent; > + } > +} > + > +void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status) > +{ > + struct iw_cm_event cm_event; > + int ret; > + > + /* > + * This is the problem area; the return status may be: > + * 1: 0xc00000b5 (3221225653) - {Device Timeout} The specified I/O > + * operation on %hs was not completed before the time-out period > + * expired > + * 2: NTSTATUS 0xc0000241 (3221226049) - The transport connection was > + * aborted by the local system. > + * If we do nothing here, iwcm will wait for IW_CM_EVENT_ESTABLISHED > + * forever and a clean shutdown becomes impossible, so we need to > + * fail the call earlier, at accept time. > + */ > + > + ep_object->connector_accept_status = status; > + > + if (status) { > + hvnd_error("Connector accept failed; status is %x\n", status); > + complete(&ep_object->connector_accept_event); > + return; > + } > + > + memset(&cm_event, 0, sizeof(cm_event)); > + cm_event.event = IW_CM_EVENT_ESTABLISHED; > + cm_event.ird = ep_object->ird; > + cm_event.ord = ep_object->ord; > + cm_event.provider_data = (void *)ep_object; > + > + /* > + * We have successfully passively accepted the > + * incoming connection. > + */ > + > + hvnd_debug("Passive connection accepted!!\n"); > + if ((ep_object->cm_id) && > + (ep_object->cm_id->event_handler)) { > + ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); > + ep_object->cm_state = hvnd_cm_established_sent; > + } > + > + complete(&ep_object->connector_accept_event); > + > + /* > + * Request notification if the other end > + * were to disconnect.
> + */ > + if (!ep_add_work_pending(ep_object)) > + return; > + > + ret = hvnd_connector_notify_disconnect(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + ep_object); > + > + if (ret) { > + ep_del_work_pending(ep_object); > + hvnd_error("Connector notify disconnect failed; ret: %d\n", > + ret); > + } > +} > + > + > +void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep_object, > + int status) > +{ > + > + struct ib_cq *ibcq; > + struct hvnd_cq *cq; > + > + cq = ep_object->cq; > + ibcq = &ep_object->cq->ibcq; > + > + if (!cq->monitor) > + return; > + > + // call the previous CQ complete > + if ((status == STATUS_PENDING) && cq->upcall_pending && > + ibcq->comp_handler) { > + ibcq->comp_handler(ibcq, ibcq->cq_context); > + cq->upcall_pending = false; > + hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); > + } > + > + if ((status != STATUS_PENDING) && ibcq->comp_handler && > + ibcq->cq_context) { > + ibcq->comp_handler(ibcq, ibcq->cq_context); > + hvnd_error("CQ comp_handler called status=%x\n", status); > + } > +} > + > +void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep_object, > + int status) > +{ > + struct ib_cq *ibcq; > + struct hvnd_cq *cq; > + int ret; > + > + cq = ep_object->cq; > + ibcq = &ep_object->cq->ibcq; > + > + // call the previous CQ complete > + if (cq->upcall_pending && ibcq->comp_handler) { > + ibcq->comp_handler(ibcq, ibcq->cq_context); > + cq->upcall_pending = false; > + hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); > + } > + > + cq->upcall_pending = true; > + if (!ep_add_work_pending(ep_object)) > + return; > + > + ret = hvnd_notify_cq(ep_object->nd_dev, > + ep_object->cq, > + ND_CQ_NOTIFY_ANY, > + (u64)ep_object); > + > + if (ret) { > + ep_del_work_pending(ep_object); > +// hvnd_manage_io_state(ep_object, true); > + hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); > + } > + > + if ((status != 0) && (status != STATUS_CANCELLED)) { > + if (ibcq->event_handler) { > + struct ib_event event; > + > + event.device = ibcq->device; > + event.event = IB_EVENT_CQ_ERR; > + event.element.cq = ibcq; > + ibcq->event_handler(&event, ibcq->cq_context); > + > + hvnd_warn("CQ event_handler called status=%x\n", > + status); > + } > + } > +} > + > +int init_cm_event(struct hvnd_ep_obj *ep_object, struct iw_cm_event *cm_event, > + int event) > +{ > + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event->local_addr; > + struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event->remote_addr; > + struct nd_read_limits rd_limits; > + union nd_sockaddr_inet local_addr; > + union nd_sockaddr_inet remote_addr; > + int ret; > + > + /* > + * Now get the local address. > + */ > + ret = hvnd_connector_get_local_addr(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &local_addr); > + > + if (ret) { > + hvnd_error("Connector get addr failed; ret: %d\n", ret); > + return ret; > + } > + /* > + * Now get the remote address. > + */ > + ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &remote_addr); > + > + if (ret) { > + hvnd_error("Connector get peer addr failed; ret: %d\n", ret); > + return ret; > + } > + > + /* > + * Get other connection parameters. > + */ > + > + ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &rd_limits); > + > + if (ret) { > + hvnd_error("Connector rd limits failed; ret: %d\n", ret); > + return ret; > + } > + > + /* > + * XXXKYS: Luke: What about the length of the priv data?
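The pending/complete pair above re-arms the host notification (hvnd_notify_cq()) around each upcall, which pairs with the usual race-free completion loop on the user side; sketch, assuming a blocking completion channel:

    struct ibv_cq *evcq;
    void *evctx;
    struct ibv_wc wc;

    ibv_req_notify_cq(cq, 0);
    while (ibv_get_cq_event(chan, &evcq, &evctx) == 0) {
        ibv_ack_cq_events(evcq, 1);
        ibv_req_notify_cq(evcq, 0);  /* re-arm before draining */
        while (ibv_poll_cq(evcq, 1, &wc) > 0)
            ;  /* consume the completion */
    }

Note that hvnd_arm_cq() itself is a stub; the actual arming rides on the hvnd_notify_cq() issued at CQ creation and re-issued here.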
> + */ > + ret = hvnd_connector_get_priv_data(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + ep_object->priv_data); > + > + if (ret) { > + hvnd_error("Connector get priv data failed; ret: %d\n", ret); > + return ret; > + } > + /* > + * Initialize CM structure. > + */ > + laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; > + hvnd_debug("Local addr is %d\n", laddr->sin_addr.s_addr); > + laddr->sin_port = local_addr.ipv4.sin_port; > + laddr->sin_family = AF_INET; > + > + raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; > + hvnd_debug("Remote addr is %d\n", raddr->sin_addr.s_addr); > + raddr->sin_port = remote_addr.ipv4.sin_port; > + raddr->sin_family = AF_INET; > + > + // KYS; LUke: is it always 148 bytes? > + cm_event->private_data_len = MAX_PRIVATE_DATA_LEN; > + cm_event->private_data = ep_object->priv_data; > + > + cm_event->ird = rd_limits.inbound; > + cm_event->ord = rd_limits.outbound; > + cm_event->event = event; > + > + ep_object->ird = cm_event->ird; > + ep_object->ord = cm_event->ord; > + > + return 0; > +} > + > +static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status) > +{ > + struct iw_cm_event cm_event; > + > + memset(&cm_event, 0, sizeof(cm_event)); > + > + cm_event.event = IW_CM_EVENT_CONNECT_REPLY; > + cm_event.status = status; > + > + hvnd_debug("returning status %d on connector %p\n", status, connector); > + > + if (connector->cm_id && connector->cm_id->event_handler) { > + connector->cm_id->event_handler(connector->cm_id, &cm_event); > + connector->cm_id->rem_ref(connector->cm_id); > + connector->cm_state = hvnd_cm_connect_reply_refused; > + } > +} > + > +void hvnd_process_events(struct work_struct *work) > +{ > + struct hvnd_work *wrk; > + struct nd_read_limits rd_limits; > + struct hvnd_ep_obj *ep_object; > + struct hvnd_ep_obj *parent; > + struct iw_cm_event cm_event; > + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event.local_addr; > + struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event.remote_addr; > + struct ndv_packet_hdr_control_1 *ctrl_hdr; > + union nd_sockaddr_inet local_addr; > + union nd_sockaddr_inet remote_addr; > + struct pkt_nd_get_connection_listener *connection_pkt; > + struct iw_cm_id *cm_id = NULL; > + int status; > + int ioctl; > + int ret; > + char priv_data[MAX_PRIVATE_DATA_LEN]; > + enum ibv_qp_state new_qp_state; > + struct incoming_pkt *incoming_pkt; > + unsigned long flags; > + > + memset(&cm_event, 0, sizeof(cm_event)); > + memset(&priv_data, 0, MAX_PRIVATE_DATA_LEN); > + > + > + wrk = container_of(work, struct hvnd_work, work); > + > + /* > + * Now call into the connection manager. 
> + */ > + ep_object = (struct hvnd_ep_obj *)wrk->callback_arg; > + parent = ep_object->parent; > + > +process_next: > + incoming_pkt = NULL; > + spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); > + if (!list_empty(&ep_object->incoming_pkt_list)) { > + incoming_pkt = list_first_entry(&ep_object->incoming_pkt_list, > + struct incoming_pkt, > + list_entry); > + list_del(&incoming_pkt->list_entry); > + } > + spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, flags); > + if (incoming_pkt == NULL) > + return; > + > + ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt->pkt; > + status = ctrl_hdr->io_status; > + ioctl = ctrl_hdr->io_cntrl_code; > + > + hvnd_debug("Process Events IOCTL is: %s; iostatus failure: %x in work queue\n", > + hvnd_get_op_name(ioctl), status); > + > + if (status != 0) { > + bool log_error = true; > + > + if (ioctl == IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT && > + status == STATUS_DISCONNECTED) // expected > + log_error = false; > + > + if (log_error) > + hvnd_warn("Process Events IOCTL is: %s; iostatus failure: %x\n", > + hvnd_get_op_name(ioctl), status); > + } > + > + cm_event.status = status; > + > + switch (ep_object->type) { > + case ND_CONNECTOR: > + switch (ioctl) { > + case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: > + > + if (ep_object->parent != NULL) { > + > + // Do nothing with this connection request if > + // listener is stopping > + if (!ep_add_work_pending(ep_object->parent)) > + break; > + > + cm_id = ep_object->parent->cm_id; //Listener > + } > + > + connection_pkt = (struct pkt_nd_get_connection_listener *) ctrl_hdr; > + > + if ((status == 0) || (status == STATUS_CANCELLED)) { > + hvnd_get_incoming_connections(ep_object->parent, > + ep_object->parent->nd_dev, > + ep_object->uctx); > + } > + > + if (status) > + goto get_connection_request_done; > + > + /* > + * Now get the local address. > + */ > + ret = hvnd_connector_get_local_addr(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &local_addr); > + > + if (ret) { > + hvnd_error("Connector get addr failed; ret: %d\n", > + ret); > + goto get_connection_request_done; > + } > + /* > + * Now get the remote address. > + */ > + ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &remote_addr); > + > + if (ret) { > + hvnd_error("Connector get peer addr failed; ret: %d\n", > + ret); > + goto get_connection_request_done; > + } > + /* > + * Get other connection parameters. > + */ > + > + ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &rd_limits); > + > + if (ret) { > + hvnd_error("Connector rd limits failed; ret: %d\n", > + ret); > + goto get_connection_request_done; > + } > + > + /* > + * XXXKYS: Luke: What about the length of the priv data?
> + */ > + ret = hvnd_connector_get_priv_data( > + ep_object->nd_dev, ep_object->uctx, > + ep_object->ep_handle, ep_object->priv_data); > + > + if (ret) { > + hvnd_error("Connector get priv data failed; ret: %d\n", > + ret); > + goto get_connection_request_done; > + } > + > + cm_event.event = IW_CM_EVENT_CONNECT_REQUEST; > + cm_event.provider_data = (void *)ep_object; > + > + laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; > + hvnd_debug("Local addr is %d\n", > + laddr->sin_addr.s_addr); > + laddr->sin_port = local_addr.ipv4.sin_port; > + laddr->sin_family = AF_INET; > + > + raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; > + hvnd_debug("Remote addr is %d\n", > + raddr->sin_addr.s_addr); > + raddr->sin_port = remote_addr.ipv4.sin_port; > + raddr->sin_family = AF_INET; > + > + // KYS; LUke: is it always 148 bytes? > + cm_event.private_data_len = MAX_PRIVATE_DATA_LEN; > + cm_event.private_data = ep_object->priv_data; > + > + cm_event.ird = rd_limits.inbound; > + cm_event.ord = rd_limits.outbound; > + > + ep_object->ird = cm_event.ird; > + ep_object->ord = cm_event.ord; > + > + > + if ((cm_id != NULL) && cm_id->event_handler) { > + cm_id->event_handler(cm_id, &cm_event); > + ep_object->cm_state = hvnd_cm_connect_request_sent; > + } > + > +get_connection_request_done: > + if (ep_object->parent != NULL) > + ep_del_work_pending(ep_object->parent); > + break; > + > + case IOCTL_ND_CONNECTOR_CONNECT: > + > + cm_event.event = IW_CM_EVENT_CONNECT_REPLY; > + if (status == STATUS_TIMEOUT && > + ep_object->connector_connect_retry < 3) { > + //TIMEOUT retry > + > + if (!ep_add_work_pending(ep_object)) > + goto refuse_connection; > + > + hvnd_warn("Connector connect timed out, reconnecting... retry count: %d\n", > + ep_object->connector_connect_retry); > + ep_object->connector_connect_retry++; > + ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, > + &ep_object->connector_connect_pkt.hdr, > + sizeof(ep_object->connector_connect_pkt), > + (u64)&ep_object->connector_connect_pkt); > + > + if (ret) { > + hvnd_error("Connector on time out failed: %d\n", > + ret); > + ep_del_work_pending(ep_object); > + goto refuse_connection; > + } > + break; > + } > + > +refuse_connection: > + if (status) { > + cm_event.status = -ECONNREFUSED; > + if (status == STATUS_TIMEOUT) > + cm_event.status = -ETIMEDOUT; > + > + hvnd_refuse_connection(ep_object, > + cm_event.status); > + break; > + } > + > + hvnd_debug("ACTIVE Connection ACCEPTED\n"); > + ret = init_cm_event(ep_object, &cm_event, > + IW_CM_EVENT_CONNECT_REPLY); > + if (ret) { > + hvnd_error("init_cm_event failed ret=%d\n", > + ret); > + goto process_done; > + } > + > + ret = hvnd_connector_complete_connect(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &new_qp_state); > + if (ret) { > + hvnd_error("connector_complete failed\n"); > + goto process_done; > + } > + > + cm_event.provider_data = (void *)ep_object; > + > + if ((ep_object->cm_id) && > + (ep_object->cm_id->event_handler)) { > + ep_object->cm_id->event_handler( > + ep_object->cm_id, &cm_event); > + ep_object->cm_state = hvnd_cm_connect_reply_sent; > + } > + /* > + * Request notification if the other end > + * were to disconnect.
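A refused or timed-out active connect surfaces to an rdma_cm consumer as an event rather than an errno; the matching fragment of the event loop, assuming the usual iw_cm status mapping (which can differ between kernel versions):

    case RDMA_CM_EVENT_REJECTED:     /* -ECONNREFUSED set above */
    case RDMA_CM_EVENT_UNREACHABLE:  /* -ETIMEDOUT after the three retries */
        rdma_destroy_qp(ev->id);
        rdma_destroy_id(ev->id);
        break;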
> + */ > + if (!ep_add_work_pending(ep_object)) > + goto process_done; > + > + ret = hvnd_connector_notify_disconnect( > + ep_object->nd_dev, ep_object->uctx, > + ep_object->ep_handle, ep_object); > + > + if (ret) { > + ep_del_work_pending(ep_object); > + hvnd_error("Connector notify disconnect failed; ret: %d\n", > + ret); > + } > + > + break; > + > + case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: > + hvnd_process_notify_disconnect(ep_object, status); > + break; > + > + > + default: > + hvnd_error("Unknown Connector IOCTL\n"); > + break; > + } > + break; > + default: > + hvnd_error("Unknown endpoint object\n"); > + break; > + } > +process_done: > + kfree(incoming_pkt); > + ep_del_work_pending(ep_object); > + > + goto process_next; > +} > + > + > +static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, > + struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx) > +{ > + struct hvnd_ep_obj *ep_object; > + int ret; > + > + ep_object = kzalloc(sizeof(struct hvnd_ep_obj), GFP_KERNEL); > + > + if (!ep_object) > + return NULL; > + > + ret = hvnd_init_ep(ep_object, cm_id, ep_type, nd_dev, uctx); > + > + if (ret) { > + hvnd_error("hvnd_init_ep failed ret=%d\n", ret); > + kfree(ep_object); > + return NULL; > + } > + > + return ep_object; > +} > + > +/* > + * return value: > + * true: I/O state is stopped, we should not do upcall > + * false: I/O state is running and normal > + */ > +#if 0 > +static bool hvnd_manage_io_state(struct hvnd_ep_obj *ep, bool failure) > +{ > + unsigned long flags; > + > + spin_lock_irqsave(&ep->ep_lk, flags); > + if (ep->to_be_destroyed) { > + hvnd_warn("ep being destroyed\n"); > + if (ep->io_outstanding) { > + hvnd_warn("ep being destroyed i/O pending waking up on %p\n", > + &ep->block_event); > + complete(&ep->block_event); > + ep->io_outstanding = false; > + } > + spin_unlock_irqrestore(&ep->ep_lk, flags); > + return true; > + } > + if (!failure) > + ep->io_outstanding = true; > + spin_unlock_irqrestore(&ep->ep_lk, flags); > + return false; > +} > +#endif > + > +static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, > + struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx) > +{ > + struct hvnd_ep_obj *connector; > + u64 connector_handle; > + int ret; > + > + /* > + * First handle the protocol for > + * destruction - outstanding I/O. > + */ > + > +// if (hvnd_manage_io_state(listener, false)) > +// return 0; > + /* > + * Create a connector. > + */ > + connector = hvnd_setup_ep(listener->cm_id, ND_CONNECTOR, nd_dev, uctx); > + if (!connector) { > + hvnd_error("hvnd_setup_ep failed\n"); > + ret = -ENOMEM; > + goto con_alloc_err; > + } > + > + ret = hvnd_cr_connector(nd_dev, uctx, > + &connector_handle); > + if (ret) { > + hvnd_error("hvnd_cr_connector failed ret=%d\n", ret); > + goto con_cr_err; > + } > + > + /* > + * Now get a connection if one is pending.
> + */ > + connector->ep_handle = connector_handle; > + connector->parent = listener; > + > + if (!ep_add_work_pending(connector)) > + goto get_connection_err; > + > + ret = hvnd_get_connection_listener(nd_dev, uctx, > + listener->ep_handle, > + connector_handle, > + (u64)connector); > + > + if (ret) { > + hvnd_debug("listener_get_connection failed\n"); > + ep_del_work_pending(connector); > + goto get_connection_err; > + } > + > + hvnd_acquire_uctx_ref(uctx); > + listener->outstanding_handle = connector_handle; > + listener->outstanding_ep = connector; > + hvnd_debug("outstanding handle is %p\n", (void *)connector_handle); > + return 0; > + > +get_connection_err: > + hvnd_free_handle(nd_dev, uctx, > + connector_handle, > + IOCTL_ND_CONNECTOR_FREE); > + > +con_cr_err: > + kfree(connector); > +con_alloc_err: > +// hvnd_manage_io_state(listener, true); > + return ret; > +} > + > +static int hvnd_create_listen(struct iw_cm_id *cm_id, int backlog) > +{ > + int ret = 0; > + struct hvnd_dev *nd_dev; > + struct hvnd_ucontext *uctx; > + struct hvnd_ep_obj *ep_object; > + union nd_sockaddr_inet addr; > + union nd_sockaddr_inet local_addr; > + u64 listener_handle; > + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; > + union nd_sockaddr_inet og_addr; > + > + > + nd_dev = to_nd_dev(cm_id->device); > + uctx = get_uctx(nd_dev, current_pid()); > + hvnd_debug("uctx is %p; pid is %d\n", uctx, current_pid()); > + > + if (cm_id->local_addr.ss_family != AF_INET) { > + hvnd_error("cm_id->local_addr.ss_family =%d not AF_INET\n", > + cm_id->local_addr.ss_family); > + return -ENOSYS; > + } > + > + /* > + * If the local address is LOOPBACK or INADDR_ANY, get an address > + * to bind the listener. For now, just get the first address > + * available. > + */ > + > + if (IN_LOOPBACK(ntohl(laddr->sin_addr.s_addr)) || > + (laddr->sin_addr.s_addr == INADDR_ANY)) { > + > + hvnd_debug("need to get an address\n"); > + ret = hvnd_get_outgoing_rdma_addr(nd_dev, uctx, &og_addr); > + > + if (ret) { > + hvnd_error("failed to get the og address\n"); > + return ret; > + } > + > + laddr->sin_addr.s_addr = og_addr.ipv4.sin_addr.s_addr; > + } > + > + cm_id->add_ref(cm_id); > + > + ep_object = hvnd_setup_ep(cm_id, ND_LISTENER, nd_dev, uctx); > + > + if (!ep_object) { > + hvnd_error("hvnd_setup_ep returned NULL\n"); > + ret = -ENOMEM; > + goto alloc_err; > + } > + > + ret = hvnd_cr_listener(nd_dev, uctx, > + &listener_handle); > + if (ret) { > + hvnd_error("hvnd_cr_listener failed ret=%d\n", ret); > + goto cr_err; > + } > + > + ep_object->ep_handle = listener_handle; > + > + cm_id->provider_data = ep_object; > + > + /* > + * Now bind the listener. > + * IPV4 support only. > + */ > + memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); > + > + ret = hvnd_bind_listener(nd_dev, uctx, > + listener_handle, > + &addr); > + if (ret) { > + hvnd_error("hvnd_bind_listener failed ret=%d\n", ret); > + goto bind_err; > + } > + > + /* > + * Now get the local address. > + */ > + ret = hvnd_get_addr_listener(nd_dev, uctx, > + listener_handle, > + &local_addr); > + if (ret) { > + hvnd_error("hvnd_get_addr_listener failed ret=%d\n", ret); > + goto bind_err; > + } > + > + /* > + * Now put the listener in the listen mode. > + */ > + > + ret = hvnd_listen_listener(nd_dev, uctx, > + listener_handle, > + backlog); > + > + if (ret) { > + hvnd_error("hvnd_listen_listener failed ret=%d\n", ret); > + goto bind_err; > + } > + > + > + /* > + * Now get a pending connection if one is pending.
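Everything in hvnd_create_listen() is ultimately driven by a plain rdma_listen(); the matching passive-side setup is roughly:

    struct rdma_cm_id *listen_id;
    struct sockaddr_in sin = {
        .sin_family = AF_INET,
        .sin_port = htons(port),  /* INADDR_ANY is rewritten above */
    };

    rdma_create_id(ev_ch, &listen_id, NULL, RDMA_PS_TCP);
    rdma_bind_addr(listen_id, (struct sockaddr *)&sin);
    rdma_listen(listen_id, 16 /* backlog */);
    /* each incoming connector then arrives as RDMA_CM_EVENT_CONNECT_REQUEST */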
> + */ > + ret = hvnd_get_incoming_connections(ep_object, nd_dev, uctx); > + if (ret) { > + hvnd_error("hvnd_get_incoming_connections failed ret=%d\n", > + ret); > + goto bind_err; > + } > + > + hvnd_acquire_uctx_ref(uctx); > + hvnd_debug("cm_id=%p\n", cm_id); > + return 0; > + > +bind_err: > + hvnd_free_handle(nd_dev, uctx, > + listener_handle, > + IOCTL_ND_LISTENER_FREE); > +cr_err: > + kfree(ep_object); > +alloc_err: > + cm_id->provider_data = NULL; > + cm_id->rem_ref(cm_id); > + return ret; > +} > + > +static int hvnd_destroy_listen(struct iw_cm_id *cm_id) > +{ > + struct hvnd_dev *nd_dev; > + struct hvnd_ucontext *uctx; > + struct hvnd_ep_obj *ep_object; > + > + nd_dev = to_nd_dev(cm_id->device); > + > + ep_object = (struct hvnd_ep_obj *)cm_id->provider_data; > + > + hvnd_debug("uctx is %p\n", ep_object->uctx); > + hvnd_debug("Destroying Listener cm_id=%p\n", cm_id); > + uctx = ep_object->uctx; > + > + // make sure there is nothing in progress on this ep > + ep_stop(ep_object); > + > + hvnd_free_handle(nd_dev, uctx, > + ep_object->ep_handle, > + IOCTL_ND_LISTENER_FREE); > + > + /* > + * We may have an outstanding connector for > + * incoming connection requests; clean it up. > + */ > + > + if (ep_object->outstanding_handle != 0) { > + > + // make sure there is nothing in progress on this ep > + ep_stop(ep_object->outstanding_ep); > + > + hvnd_free_handle(nd_dev, uctx, > + ep_object->outstanding_handle, > + IOCTL_ND_CONNECTOR_FREE); > + > + > + hvnd_drop_uctx_ref(nd_dev, uctx); > + hvnd_destroy_ep(ep_object->outstanding_ep); > + } > + > + /* > + * Now everything should have stopped > + */ > + > + cm_id->rem_ref(cm_id); > + hvnd_destroy_ep(ep_object); > + cm_id->provider_data = NULL; > + hvnd_drop_uctx_ref(nd_dev, uctx); > + > + hvnd_debug("cm_id=%p\n", cm_id); > + return 0; > +} > + > +static void hvnd_qp_add_ref(struct ib_qp *ibqp) > +{ > + struct hvnd_qp *qp = to_nd_qp(ibqp); > + > + atomic_inc(&qp->refcnt); > +} > + > +void hvnd_qp_rem_ref(struct ib_qp *ibqp) > +{ > + struct hvnd_qp *qp = to_nd_qp(ibqp); > + > + if (atomic_dec_and_test(&qp->refcnt)) > + wake_up(&qp->wait); > +} > + > +static DEVICE_ATTR(hw_rev, 0444, hvnd_show_rev, NULL); > +static DEVICE_ATTR(fw_ver, 0444, hvnd_show_fw_ver, NULL); > +static DEVICE_ATTR(hca_type, 0444, hvnd_show_hca, NULL); > +static DEVICE_ATTR(board_id, 0444, hvnd_show_board, NULL); > + > +static struct device_attribute *hvnd_class_attributes[] = { > + &dev_attr_hw_rev, > + &dev_attr_fw_ver, > + &dev_attr_hca_type, > + &dev_attr_board_id, > +}; > + > +int hvnd_register_device(struct hvnd_dev *dev, char *ip_addr, char *mac_addr) > +{ > + int ret; > + > + dev->ibdev.owner = THIS_MODULE; > + dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY | IB_DEVICE_MEM_WINDOW; > + dev->ibdev.local_dma_lkey = 0; > + dev->ibdev.uverbs_cmd_mask = > + (1ull << IB_USER_VERBS_CMD_GET_CONTEXT) | > + (1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) | > + (1ull << IB_USER_VERBS_CMD_QUERY_PORT) | > + (1ull << IB_USER_VERBS_CMD_ALLOC_PD) | > + (1ull << IB_USER_VERBS_CMD_DEALLOC_PD) | > + (1ull << IB_USER_VERBS_CMD_REG_MR) | > + (1ull << IB_USER_VERBS_CMD_DEREG_MR) | > + (1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL) | > + (1ull << IB_USER_VERBS_CMD_CREATE_CQ) | > + (1ull << IB_USER_VERBS_CMD_DESTROY_CQ) | > + (1ull << IB_USER_VERBS_CMD_REQ_NOTIFY_CQ) | > + (1ull << IB_USER_VERBS_CMD_CREATE_QP) | > + (1ull << IB_USER_VERBS_CMD_MODIFY_QP) | > + (1ull << IB_USER_VERBS_CMD_QUERY_QP) | > + (1ull << IB_USER_VERBS_CMD_POLL_CQ) | > + (1ull << IB_USER_VERBS_CMD_DESTROY_QP) | > + (1ull <<
IB_USER_VERBS_CMD_POST_SEND) | > + (1ull << IB_USER_VERBS_CMD_POST_RECV); > + dev->ibdev.node_type = RDMA_NODE_RNIC; > + memcpy(dev->ibdev.node_desc, HVND_NODE_DESC, sizeof(HVND_NODE_DESC)); > + memcpy(&dev->ibdev.node_guid, mac_addr, 6); > + dev->ibdev.phys_port_cnt = 1; //dev->nports; > + dev->ibdev.num_comp_vectors = 1; > + dev->ibdev.query_device = hvnd_query_device; > + dev->ibdev.query_port = hvnd_query_port; > + dev->ibdev.get_link_layer = hvnd_get_link_layer; > + dev->ibdev.query_pkey = hvnd_query_pkey; > + dev->ibdev.query_gid = hvnd_query_gid; > + dev->ibdev.alloc_ucontext = hvnd_alloc_ucontext; > + dev->ibdev.dealloc_ucontext = hvnd_dealloc_ucontext; > + dev->ibdev.mmap = hvnd_mmap; > + dev->ibdev.alloc_pd = hvnd_allocate_pd; > + dev->ibdev.dealloc_pd = hvnd_deallocate_pd; > + dev->ibdev.create_ah = hvnd_ah_create; > + dev->ibdev.destroy_ah = hvnd_ah_destroy; > + dev->ibdev.create_qp = hvnd_ib_create_qp; > + dev->ibdev.modify_qp = hvnd_ib_modify_qp; > + dev->ibdev.query_qp = hvnd_ib_query_qp; > + dev->ibdev.destroy_qp = hvnd_destroy_qp; > + dev->ibdev.create_cq = hvnd_ib_create_cq; > + dev->ibdev.destroy_cq = hvnd_ib_destroy_cq; > + dev->ibdev.resize_cq = hvnd_resize_cq; > + dev->ibdev.poll_cq = hvnd_poll_cq; > + dev->ibdev.get_dma_mr = hvnd_get_dma_mr; > + dev->ibdev.reg_user_mr = hvnd_reg_user_mr; > + dev->ibdev.dereg_mr = hvnd_dereg_mr; > + dev->ibdev.alloc_mw = hvnd_alloc_mw; > + dev->ibdev.dealloc_mw = hvnd_dealloc_mw; > + dev->ibdev.attach_mcast = hvnd_multicast_attach; > + dev->ibdev.detach_mcast = hvnd_multicast_detach; > + dev->ibdev.process_mad = hvnd_process_mad; > + dev->ibdev.req_notify_cq = hvnd_arm_cq; > + dev->ibdev.post_send = hvnd_post_send; > + dev->ibdev.post_recv = hvnd_post_receive; > + dev->ibdev.uverbs_abi_ver = MLX4_IB_UVERBS_ABI_VERSION; > + > + dev->ibdev.get_port_immutable = hvnd_get_port_immutable; > + > + //DMA ops for mapping all possible addresses > + dev->ibdev.dev.parent = &(dev->hvdev->device); > + dev->ibdev.dev.dma_ops = &vmbus_dma_ops; > + dev->ibdev.dev.dma_mask = (u64 *) DMA_BIT_MASK(64); > + dev->ibdev.dev.coherent_dma_mask = DMA_BIT_MASK(64); > + > + dev->ibdev.iwcm = kmalloc(sizeof(struct iw_cm_verbs), GFP_KERNEL); > + if (!dev->ibdev.iwcm) > + return -ENOMEM; > + > + dev->ibdev.iwcm->connect = hvnd_connect; > + dev->ibdev.iwcm->accept = hvnd_accept_cr; > + dev->ibdev.iwcm->reject = hvnd_reject_cr; > + dev->ibdev.iwcm->create_listen = hvnd_create_listen; > + dev->ibdev.iwcm->destroy_listen = hvnd_destroy_listen; > + dev->ibdev.iwcm->add_ref = hvnd_qp_add_ref; > + dev->ibdev.iwcm->rem_ref = hvnd_qp_rem_ref; > + dev->ibdev.iwcm->get_qp = hvnd_get_qp; > + > + strlcpy(dev->ibdev.name, "mlx4_%d", IB_DEVICE_NAME_MAX); > + ret = ib_register_device(&dev->ibdev, NULL); > + if (ret) { > + hvnd_error("ib_register_device failed ret=%d\n", ret); > + goto bail1; > + } > + > +#if 0 > + for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) { > + ret = device_create_file(&dev->ibdev.dev, > + hvnd_class_attributes[i]); > + if (ret) { > + hvnd_error("device_create_file failed ret=%d\n", ret); > + goto bail2; > + } > + } > +#endif > + > + dev->ib_active = true; > + return 0; > +//bail2: > +// ib_unregister_device(&dev->ibdev); > +bail1: > + kfree(dev->ibdev.iwcm); > + return ret; > +} > + > +void hvnd_unregister_device(struct hvnd_dev *dev) > +{ > + int i; > + > + for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) > + device_remove_file(&dev->ibdev.dev, > + hvnd_class_attributes[i]); > + ib_unregister_device(&dev->ibdev); > + kfree(dev->ibdev.iwcm); 
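Because the device registers under the mlx4 name and ABI, discovery needs no special userspace; a quick sanity check on the guest:

    struct ibv_device **list = ibv_get_device_list(NULL);
    int i;

    for (i = 0; list && list[i]; i++)
        printf("%s\n", ibv_get_device_name(list[i]));
    ibv_free_device_list(list);

which should show an mlx4_N entry for the vmbus device on an Azure guest.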
> + ib_dealloc_device((struct ib_device *)dev); > +} > + > +static int hvnd_try_bind_nic(unsigned char *mac, __be32 ip) > +{ > + int ret; > + struct hvnd_dev *nd_dev = g_nd_dev; > + > + mutex_lock(&nd_dev->bind_mutex); > + if (nd_dev->bind_complete) { > + mutex_unlock(&nd_dev->bind_mutex); > + return 1; > + } > + > + memcpy(nd_dev->mac_addr, mac, 6); > + *(__be32 *)(nd_dev->ip_addr) = ip; > + > + /* > + * Bind the NIC. > + */ > + hvnd_info("trying to bind to IP %pI4 MAC %pM\n", > + nd_dev->ip_addr, nd_dev->mac_addr); > + ret = hvnd_bind_nic(nd_dev, false, nd_dev->ip_addr, nd_dev->mac_addr); > + if (ret || nd_dev->bind_pkt.pkt_hdr.status) { > + mutex_unlock(&nd_dev->bind_mutex); > + return 1; > + } > + > + /* if we reach here, this means bind_nic is a success */ > + hvnd_error("successfully bound to IP %pI4 MAC %pM\n", > + nd_dev->ip_addr, nd_dev->mac_addr); > + complete(&nd_dev->addr_set); > + nd_dev->bind_complete = 1; > + mutex_unlock(&nd_dev->bind_mutex); > + > + ret = hvnd_register_device(nd_dev, nd_dev->ip_addr, nd_dev->mac_addr); > + > + if (!ret) > + return 0; > + > + hvnd_error("hvnd_register_device failed ret=%d\n", ret); > + > + /* roll back all allocated resources on error */ > + iounmap(nd_dev->mmio_virt); > + release_resource(&nd_dev->mmio_resource); > + > + vmbus_close(nd_dev->hvdev->channel); > + ib_dealloc_device((struct ib_device *)nd_dev); > + > + return 1; > +} > + > +static void hvnd_inetaddr_event_up(unsigned long event, struct in_ifaddr *ifa) > +{ > + hvnd_try_bind_nic(ifa->ifa_dev->dev->dev_addr, ifa->ifa_address); > +} > + > +static int hvnd_inetaddr_event(struct notifier_block *notifier, > + unsigned long event, void *ptr) > +{ > + struct in_ifaddr *ifa = ptr; > + > + switch (event) { > + case NETDEV_UP: > + hvnd_inetaddr_event_up(event, ifa); > + break; > + default: > + hvnd_debug("Received inetaddr event %lu\n", event); > + } > + > + return NOTIFY_DONE; > +} > + > +static struct notifier_block hvnd_inetaddr_notifier = { > + .notifier_call = hvnd_inetaddr_event, > +}; > + > +static int start_bind_nic(void) > +{ > + struct net_device *dev; > + struct in_device *idev; > + struct in_ifaddr *ifa; > + > + register_inetaddr_notifier(&hvnd_inetaddr_notifier); > + > + rtnl_lock(); > + for_each_netdev(&init_net, dev) { > + idev = in_dev_get(dev); > + if (!idev) > + continue; > + for (ifa = (idev)->ifa_list; > + ifa && !(ifa->ifa_flags&IFA_F_SECONDARY); > + ifa = ifa->ifa_next) { > + hvnd_try_bind_nic(dev->dev_addr, ifa->ifa_address); > + } > + } > + rtnl_unlock(); > + > + return 0; > +} > + > +static int hvnd_probe(struct hv_device *dev, > + const struct hv_vmbus_device_id *dev_id) > +{ > + struct hvnd_dev *nd_dev; > + int ret = 0; > + > + hvnd_debug("hvnd starting\n"); > + > + nd_dev = (struct hvnd_dev *)ib_alloc_device(sizeof(struct hvnd_dev)); > + if (!nd_dev) { > + ret = -ENOMEM; > + goto err_out0; > + } > + > + nd_dev->hvdev = dev; > + /* > + * We are going to masquerade as MLX4 device; > + * Set the vendor and device ID accordingly. > + */ > + dev->vendor_id = 0x15b3; //Mellanox > + dev->device_id = 0x1003; //Mellanox HCA > + INIT_LIST_HEAD(&nd_dev->listentry); > + spin_lock_init(&nd_dev->uctxt_lk); > + nd_dev->ib_active = false; > + > + /* > + * Initialize the state for the id table. 
> + */ > + spin_lock_init(&nd_dev->id_lock); > + idr_init(&nd_dev->cqidr); > + idr_init(&nd_dev->qpidr); > + idr_init(&nd_dev->mmidr); > + idr_init(&nd_dev->irpidr); > + idr_init(&nd_dev->uctxidr); > + > + atomic_set(&nd_dev->open_cnt, 0); > + > + sema_init(&nd_dev->query_pkt_sem, 1); > + > + ret = vmbus_open(dev->channel, HVND_RING_SZ, HVND_RING_SZ, NULL, 0, > + hvnd_callback, dev); > + > + if (ret) { > + hvnd_error("vmbus_open failed ret=%d\n", ret); > + goto err_out1; > + } > + > + hv_set_drvdata(dev, nd_dev); > + > + ret = hvnd_negotiate_version(nd_dev); > + > + if (ret) { > + hvnd_error("hvnd_negotiate_version failed ret=%d\n", ret); > + goto err_out2; > + } > + > + /* > + * Register resources with the host. > + */ > + ret = hvnd_init_resources(nd_dev); > + if (ret) { > + hvnd_error("hvnd_init_resources failed ret=%d\n", ret); > + goto err_out2; > + } > + > + /* > + * Try to bind every NIC to ND channel, > + * ND host will only return success for the correct one > + */ > + nd_dev->bind_complete = 0; > + mutex_init(&nd_dev->bind_mutex); > + init_completion(&nd_dev->addr_set); > + > + g_nd_dev = nd_dev; > + start_bind_nic(); > + > + return 0; > + > +err_out2: > + vmbus_close(dev->channel); > + > +err_out1: > + ib_dealloc_device((struct ib_device *)nd_dev); > + > +err_out0: > + return ret; > +} > + > +static int hvnd_remove(struct hv_device *dev) > +{ > + struct hvnd_dev *nd_dev = hv_get_drvdata(dev); > + > + unregister_inetaddr_notifier(&hvnd_inetaddr_notifier); > + hvnd_bind_nic(nd_dev, true, nd_dev->ip_addr, nd_dev->mac_addr); > + hvnd_unregister_device(nd_dev); > + vmbus_close(dev->channel); > + iounmap(nd_dev->mmio_virt); > + release_resource(&nd_dev->mmio_resource); > + return 0; > +} > + > +static const struct hv_vmbus_device_id id_table[] = { > + /* VMBUS RDMA class guid */ > + /* 8c2eaf3d-32a7-4b09-ab99-bd1f1c86b501 */ > + { HV_ND_GUID, }, > + { }, > +}; > + > +MODULE_DEVICE_TABLE(vmbus, id_table); > + > +static struct hv_driver hvnd_drv = { > + .name = "hv_guest_rdma", > + .id_table = id_table, > + .probe = hvnd_probe, > + .remove = hvnd_remove, > +}; > + > + > +static int __init init_hvnd_drv(void) > +{ > + > + pr_info("Registered HyperV networkDirect Driver\n"); > + hvnd_addr_init(); > + return vmbus_driver_register(&hvnd_drv); > +} > + > +static void exit_hvnd_drv(void) > +{ > + pr_info("De-Registered HyperV networkDirect Driver\n"); > + hvnd_addr_deinit(); > + vmbus_driver_unregister(&hvnd_drv); > +} > + > + > +module_init(init_hvnd_drv); > +module_exit(exit_hvnd_drv); > + > +MODULE_DESCRIPTION("Hyper-V NetworkDirect Driver"); > +MODULE_LICENSE("GPL"); > +MODULE_VERSION("4.1.3"); > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/user.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/user.h > new file mode 100644 > index 000000000000..07e6769ef43b > --- /dev/null > +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/user.h > @@ -0,0 +1,107 @@ > +/* > + * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved. > + * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved. > + * > + * This software is available to you under a choice of one of two > + * licenses. 
You may choose to be licensed under the terms of the GNU > + * General Public License (GPL) Version 2, available from the file > + * COPYING in the main directory of this source tree, or the > + * OpenIB.org BSD license below: > + * > + * Redistribution and use in source and binary forms, with or > + * without modification, are permitted provided that the following > + * conditions are met: > + * > + * - Redistributions of source code must retain the above > + * copyright notice, this list of conditions and the following > + * disclaimer. > + * > + * - Redistributions in binary form must reproduce the above > + * copyright notice, this list of conditions and the following > + * disclaimer in the documentation and/or other materials > + * provided with the distribution. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND > + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS > + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN > + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN > + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE > + * SOFTWARE. > + */ > + > +#ifndef MLX4_IB_USER_H > +#define MLX4_IB_USER_H > + > +#include <linux/types.h> > + > +/* > + * Increment this value if any changes that break userspace ABI > + * compatibility are made. > + */ > + > +#define MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION 3 > +#define MLX4_IB_UVERBS_ABI_VERSION 4 > + > +/* > + * Make sure that all structs defined in this file remain laid out so > + * that they pack the same way on 32-bit and 64-bit architectures (to > + * avoid incompatibility between 32-bit userspace and 64-bit kernels). > + * In particular do not use pointer types -- pass pointers in __u64 > + * instead. > + */ > + > +struct mlx4_ib_alloc_ucontext_resp_v3 { > + __u32 qp_tab_size; > + __u16 bf_reg_size; > + __u16 bf_regs_per_page; > +}; > + > +struct mlx4_ib_alloc_ucontext_resp { > + __u32 dev_caps; > + __u32 qp_tab_size; > + __u16 bf_reg_size; > + __u16 bf_regs_per_page; > + __u32 cqe_size; > +}; > + > +struct mlx4_ib_alloc_pd_resp { > + __u32 pdn; > + __u32 reserved; > +}; > + > +struct mlx4_ib_create_cq { > + __u64 buf_addr; > + __u64 db_addr; > +}; > + > +struct mlx4_ib_create_cq_resp { > + __u32 cqn; > + __u32 reserved; > +}; > + > +struct mlx4_ib_resize_cq { > + __u64 buf_addr; > +}; > + > +struct mlx4_ib_create_srq { > + __u64 buf_addr; > + __u64 db_addr; > +}; > + > +struct mlx4_ib_create_srq_resp { > + __u32 srqn; > + __u32 reserved; > +}; > + > +struct mlx4_ib_create_qp { > + __u64 buf_addr; > + __u64 db_addr; > + __u8 log_sq_bb_count; > + __u8 log_sq_stride; > + __u8 sq_no_prefetch; > + __u8 reserved[5]; > +}; > + > +#endif /* MLX4_IB_USER_H */ > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.c > new file mode 100644 > index 000000000000..daf68dcccb1b > --- /dev/null > +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.c > @@ -0,0 +1,3074 @@ > +/* > + * Copyright (c) 2014, Microsoft Corporation. > + * > + * Author: > + * K. Y. Srinivasan <kys@microsoft.com> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms of the GNU General Public License version 2 as published > + * by the Free Software Foundation. 
> + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or > + * NON INFRINGEMENT. See the GNU General Public License for more > + * details. > + * > + * Bug fixes/enhancements: Long Li <longli@microsoft.com> > + */ > + > +#include <linux/completion.h> > +#include <linux/module.h> > +#include <linux/errno.h> > +#include <linux/hyperv.h> > +#include <linux/efi.h> > +#include <linux/slab.h> > +#include <linux/cred.h> > +#include <linux/sched.h> > +#include <linux/types.h> > +#include <linux/scatterlist.h> > +#include <rdma/ib_umem.h> > +#include <rdma/ib_user_verbs.h> > +#include <asm-generic/delay.h> > + > +#include "vmbus_rdma.h" > + > +/* > + * We only have a single rdma device on the host; > + * have a single receive buffer. > + */ > + > + > +static char hvnd_recv_buffer[PAGE_SIZE * 4]; > + > +static atomic_t irp_local_hdl; > + > +char *hvnd_get_op_name(int ioctl) > +{ > + switch (ioctl) { > + case IOCTL_ND_PROVIDER_INIT: > + return "IOCTL_ND_PROVIDER_INIT\n"; > + case IOCTL_ND_PROVIDER_BIND_FILE: > + return "IOCTL_ND_PROVIDER_BIND_FILE\n"; > + case IOCTL_ND_ADAPTER_OPEN: > + return "IOCTL_ND_ADAPTER_OPEN\n"; > + > + case IOCTL_ND_ADAPTER_CLOSE: > + return "IOCTL_ND_ADAPTER_CLOSE\n"; > + > + case IOCTL_ND_ADAPTER_QUERY: > + return "IOCTL_ND_ADAPTER_QUERY\n"; > + > + case IOCTL_ND_PD_CREATE: > + return "IOCTL_ND_PD_CREATE\n"; > + > + case IOCTL_ND_PD_FREE: > + return "IOCTL_ND_PD_FREE\n"; > + > + case IOCTL_ND_CQ_CREATE: > + return "IOCTL_ND_CQ_CREATE\n"; > + > + case IOCTL_ND_CQ_FREE: > + return "IOCTL_ND_CQ_FREE\n"; > + case IOCTL_ND_CQ_CANCEL_IO: > + return "IOCTL_ND_CQ_CANCEL_IO\n"; > + case IOCTL_ND_CQ_GET_AFFINITY: > + return "IOCTL_ND_CQ_GET_AAFINITY\n"; > + case IOCTL_ND_CQ_MODIFY: > + return "IOCTL_ND_CQ_MODIFY\n"; > + > + case IOCTL_ND_CQ_NOTIFY: > + return "IOCTL_ND_CQ_NOTIFY\n"; > + > + > + case IOCTL_ND_LISTENER_CREATE: > + return "IOCTL_ND_LISTENER_CREATE\n"; > + > + case IOCTL_ND_LISTENER_FREE: > + return "IOCTL_ND_LISTENER_FREE\n"; > + > + case IOCTL_ND_QP_FREE: > + return "IOCTL_ND_QP_FREE\n"; > + > + case IOCTL_ND_CONNECTOR_CANCEL_IO: > + return "IOCTL_ND_CONNECTOR_CANCEL_IO\n"; > + > + case IOCTL_ND_LISTENER_CANCEL_IO: > + return "IOCTL_ND_LISTENER_CANCEL_IO\n"; > + > + case IOCTL_ND_LISTENER_BIND: > + return "IOCTL_ND_LISTENER_BIND\n"; > + > + case IOCTL_ND_LISTENER_LISTEN: > + return "IOCTL_ND_LISTENER_LISTEN\n"; > + > + case IOCTL_ND_LISTENER_GET_ADDRESS: > + return "IOCTL_ND_LISTENER_GET_ADDRESS\n"; > + > + case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: > + return "IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST\n"; > + > + > + > + case IOCTL_ND_CONNECTOR_CREATE: > + return "IOCTL_ND_CONNECTOR_CREATE\n"; > + > + case IOCTL_ND_CONNECTOR_FREE: > + return "IOCTL_ND_CONNECTOR_FREE\n"; > + > + case IOCTL_ND_CONNECTOR_BIND: > + return "IOCTL_ND_CONNECTOR_BIND\n"; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_CONNECT: > + return "IOCTL_ND_CONNECTOR_CONNECT\n"; > + > + case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: > + return "IOCTL_ND_CONNECTOR_COMPLETE_CONNECT\n"; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_ACCEPT: > + return "IOCTL_ND_CONNECTOR_ACCEPT\n"; > + > + case IOCTL_ND_CONNECTOR_REJECT: > + return "IOCTL_ND_CONNECTOR_REJECT\n"; > + > + case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: > + return "IOCTL_ND_CONNECTOR_GET_READ_LIMITS\n"; > + > + case 
IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: > + return "IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA\n"; > + > + case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: > + return "IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS\n"; > + > + case IOCTL_ND_CONNECTOR_GET_ADDRESS: > + return "IOCTL_ND_CONNECTOR_GET_ADDRESS\n"; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: > + return "IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT\n"; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_DISCONNECT: > + return "IOCTL_ND_CONNECTOR_DISCONNECT\n"; > + > + > + > + case IOCTL_ND_QP_CREATE: > + return "IOCTL_ND_QP_CREATE\n"; > + > + case IOCTL_ND_MR_CREATE: > + return "IOCTL_ND_MR_CREATE\n"; > + > + case IOCTL_ND_MR_FREE: > + return "IOCTL_ND_MR_FREE\n"; > + case IOCTL_ND_MR_REGISTER: > + return "IOCTL_ND_MR_REGISTER\n"; > + case IOCTL_ND_MR_DEREGISTER: > + return "IOCTL_ND_MR_DEREGISTER\n"; > + case IOCTL_ND_MR_CANCEL_IO: > + return "IOCTL_ND_MR_CANCEL_IO\n"; > + case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: > + return "IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST\n"; > + case IOCTL_ND_QP_FLUSH: > + return "IOCTL_ND_QP_FLUSH\n"; > + > + default: > + return "Unknown IOCTL\n"; > + } > +} > +int get_irp_handle(struct hvnd_dev *nd_dev, u32 *local, void *irp_ctx) > +{ > + unsigned int local_handle; > + int ret; > + > + local_handle = atomic_inc_return(&irp_local_hdl); > + *local = local_handle; > + > + /* > + * Now associate the local handle with the pointer. > + */ > + ret = insert_handle(nd_dev, &nd_dev->irpidr, irp_ctx, local_handle); > + hvnd_debug("irp_ctx=%p local_handle=%u\n", irp_ctx, local_handle); > + > + if (ret) { > + hvnd_error("insert_handle failed ret=%d\n", ret); > + return ret; > + } > + > + return 0; > +} > + > +void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp) > +{ > + remove_handle(nd_dev, &nd_dev->irpidr, irp); > + > +} > + > +static void init_pfn(u64 *pfn, void *addr, u32 length) > +{ > + int i; > + u32 offset = offset_in_page(addr); > + u32 num_pfn = DIV_ROUND_UP(offset + length, PAGE_SIZE); > + > + for (i = 0; i < num_pfn; i++) > + pfn[i] = virt_to_phys((u8 *)addr + > + (PAGE_SIZE * i)) >> PAGE_SHIFT; > +} > + > + > +static void user_va_init_pfn(u64 *pfn, struct ib_umem *umem) > +{ > + int entry; > + struct scatterlist *sg; > + int i = 0; > + > + for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) { > + pfn[i++] = page_to_pfn(sg_page(sg)); > + } > +} > + > +static u32 get_local_handle(void *p) > +{ > + u64 val = (unsigned long)p; > + > + return (u32)val; > +} > + > +static int hvnd_send_pg_buffer(struct hvnd_dev *nd_dev, > + struct vmbus_packet_mpb_array *desc, > + u32 desc_size, > + void *buffer, > + u32 bufferlen, u64 cookie) > +{ > + int ret; > + int t; > + struct hvnd_cookie hvnd_cookie; > + > + hvnd_cookie.pkt = (void *)cookie; > + init_completion(&hvnd_cookie.host_event); > + > + ret = vmbus_sendpacket_mpb_desc(nd_dev->hvdev->channel, > + desc, > + desc_size, > + buffer, bufferlen, > + (u64)(&hvnd_cookie)); > + > + if (ret) { > + hvnd_error("vmbus_sendpacket_mpb_desc failed ret=%d\n", ret); > + goto err; > + } > + > + t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); > + > + if (t == 0) { > + hvnd_error("wait_for_completion_timeout timed out\n"); > + ret = -ETIMEDOUT; > + } > + > +err: > + return ret; > +} > + > +static int hvnd_send_packet(struct hvnd_dev *nd_dev, void *buffer, > + u32 bufferlen, u64 cookie, bool block) > +{ > + int ret; > + int t; > + struct hvnd_cookie hvnd_cookie; > + > + hvnd_cookie.pkt = (void *)cookie; > + 
init_completion(&hvnd_cookie.host_event); > + > + ret = vmbus_sendpacket(nd_dev->hvdev->channel, buffer, bufferlen, > + (u64)(&hvnd_cookie), VM_PKT_DATA_INBAND, > + VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); > + > + if (ret) { > + hvnd_error("vmbus_send pkt failed: %d\n", ret); > + goto err; > + } > + > + if (!block) > + return ret; > + > + t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); > + > + if (t == 0) { > + hvnd_error("wait_for_completion_timeout timed out\n"); > + ret = -ETIMEDOUT; > + } > + > +err: > + return ret; > +} > + > +static int hvnd_send_pgbuf_ioctl_pkt(struct hvnd_dev *nd_dev, > + struct vmbus_packet_mpb_array *desc, > + u32 desc_size, > + struct ndv_packet_hdr_control_1 *hdr, > + u32 pkt_size, u64 cookie) > +{ > + int ret; > + int ioctl; > + > + ioctl = hdr->io_cntrl_code; > + > + > + ret = hvnd_send_pg_buffer(nd_dev, desc, desc_size, > + hdr, pkt_size, cookie); > + > + if (ret) > + return ret; > + > + if (hdr->pkt_hdr.status != 0) { > + hvnd_error("IOCTL: %s failed; status is %x\n", > + hvnd_get_op_name(ioctl), > + hdr->pkt_hdr.status); > + return -EINVAL; > + } > + > + switch (hdr->io_status) { > + case STATUS_SUCCESS: > + case STATUS_PENDING: > + return 0; > + > + default: > + hvnd_error("IOCTL: %s failed io status is %x\n", > + hvnd_get_op_name(ioctl), hdr->io_status); > + return -EINVAL; > + } > +} > + > +int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev, > + struct ndv_packet_hdr_control_1 *hdr, > + u32 pkt_size, u64 cookie) > +{ > + int ret; > + int ioctl; > + bool block; > + > + block = (hdr->irp_handle.val64 == 0) ? true : false; > + > + > + ioctl = hdr->io_cntrl_code; > + > + ret = hvnd_send_packet(nd_dev, hdr, pkt_size, cookie, block); > + > + if (ret) > + return ret; > + > + if (!block) > + return ret; > + > + if (hdr->pkt_hdr.status != 0) { > + hvnd_error("IOCTL: %s failed; status is %x\n", > + hvnd_get_op_name(ioctl), hdr->pkt_hdr.status); > + return -EINVAL; > + } > + > + switch (hdr->io_status) { > + case STATUS_SUCCESS: > + case STATUS_PENDING: > + return 0; > + > + default: > + hvnd_warn("IOCTL: %s failed io status is %x\n", > + hvnd_get_op_name(ioctl), hdr->io_status); > + return -EINVAL; > + } > +} > + > +void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr, > + u32 data_sz, u32 local, u32 remote, > + u32 ioctl_code, > + u32 ext_data_sz, u32 ext_data_offset, > + u64 irp_handle) > + > +{ > + int pkt_type; > + > + pkt_type = NDV_PKT_ID1_CONTROL; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + hdr->pkt_hdr.packet_type = pkt_type; > + hdr->pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > + hdr->pkt_hdr.data_sz = data_sz; > + > + hdr->pkt_hdr.status = 0; > + > + hdr->file_handle.local = local; > + hdr->file_handle.remote = remote; > + hdr->irp_handle.val64 = irp_handle; > + > + hdr->io_cntrl_code = ioctl_code; > + hdr->output_buf_sz = data_sz - ext_data_sz; > + hdr->input_buf_sz = data_sz - ext_data_sz; > + > + hdr->input_output_buf_offset = 0; > + > + hdr->extended_data.size = ext_data_sz; > + hdr->extended_data.offset = ext_data_offset; > +} > + > + > +int hvnd_create_file(struct hvnd_dev *nd_dev, void *uctx, > + struct ndv_pkt_hdr_create_1 *create, u32 file_flags) > +{ > + int ret; > + int pkt_type; > + > + > + pkt_type = NDV_PKT_ID1_CREATE; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + create->pkt_hdr.packet_type = pkt_type; > + create->pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); > + create->pkt_hdr.data_sz = 0; > + > + create->handle.local = 
get_local_handle(uctx); > + create->access_mask = STANDARD_RIGHTS_ALL; > + create->open_options = OPEN_EXISTING; > + create->file_attributes = FILE_ATTRIBUTE_NORMAL | file_flags; > + create->share_access = FILE_SHARE_ALL; > + > + ret = hvnd_send_packet(nd_dev, create, > + sizeof(struct ndv_pkt_hdr_create_1), > + (unsigned long)create, true); > + return ret; > +} > + > +int hvnd_cleanup_file(struct hvnd_dev *nd_dev, u32 local, u32 remote) > +{ > + int ret; > + int pkt_type; > + struct ndv_pkt_hdr_cleanup_1 cleanup_pkt; > + > + > + pkt_type = NDV_PKT_ID1_CLEANUP; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + > + cleanup_pkt.pkt_hdr.packet_type = pkt_type; > + cleanup_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); > + cleanup_pkt.pkt_hdr.data_sz = 0; > + > + cleanup_pkt.handle.local = local; > + cleanup_pkt.handle.remote = remote; > + > + ret = hvnd_send_packet(nd_dev, &cleanup_pkt, > + sizeof(struct ndv_pkt_hdr_create_1), > + (unsigned long)&cleanup_pkt, true); > + return ret; > +} > + > + > +static int hvnd_do_ioctl(struct hvnd_dev *nd_dev, u32 ioctl, > + struct pkt_nd_provider_ioctl *pkt, > + union ndv_context_handle *hdr_handle, > + struct nd_handle *ioctl_handle, > + u8 *buf, u32 buf_len, bool c_in, bool c_out, u64 irp_val) > +{ > + int ret; > + int pkt_type; > + > + pkt_type = NDV_PKT_ID1_CONTROL; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + > + pkt->hdr.pkt_hdr.packet_type = pkt_type; > + pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > + pkt->hdr.pkt_hdr.data_sz = (sizeof(struct pkt_nd_provider_ioctl) - > + sizeof(struct ndv_packet_hdr_control_1)); > + > + pkt->hdr.file_handle.local = hdr_handle->local; > + pkt->hdr.file_handle.remote = hdr_handle->remote; > + hvnd_debug("create handle local: %x remote: %x\n", > + hdr_handle->local, hdr_handle->remote); > + > + pkt->hdr.irp_handle.val64 = irp_val; > + > + pkt->hdr.io_cntrl_code = ioctl; > + pkt->hdr.output_buf_sz = sizeof(struct nd_ioctl); > + pkt->hdr.input_buf_sz = sizeof(struct nd_ioctl); > + pkt->hdr.input_output_buf_offset = 0; > + memset(&pkt->ioctl.handle, 0, sizeof(struct nd_handle)); > + pkt->ioctl.handle.version = ND_VERSION_1; > + > + switch (ioctl) { > + case IOCTL_ND_PROVIDER_BIND_FILE: > + pkt->ioctl.handle.handle = ioctl_handle->handle; > + break; > + default: > + break; > + }; > + > + /* > + * Copy the input buffer, if needed. 
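A note on the send path above: every blocking request parks on a completion embedded in an on-stack cookie, and the 500*HZ timeout makes that a 500 second ceiling per request. The other half of the contract is not in this hunk, but from the usage the channel callback must locate the cookie from the transaction id and fire the completion; a minimal sketch, assuming the struct hvnd_cookie { void *pkt; struct completion host_event; } layout implied above:

	/* called from hvnd_callback() once the host's reply has been
	 * matched to the request's transaction id (illustration only)
	 */
	static void hvnd_wake_sender(struct hvnd_cookie *cookie)
	{
		/* the reply payload was already copied into cookie->pkt */
		complete(&cookie->host_event);
	}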
> + */ > + > + if (c_in && (buf != NULL)) > + memcpy(pkt->ioctl.raw_buffer, buf, buf_len); > + > + ret = hvnd_send_packet(nd_dev, pkt, > + sizeof(struct pkt_nd_provider_ioctl), > + (unsigned long)pkt, true); > + > + if (ret) > + return ret; > + > + if (c_out && (buf != NULL)) > + memcpy(buf, pkt->ioctl.raw_buffer, buf_len); > + > + return ret; > +} > + > +static int idr_callback(int id, void *p, void *data) > +{ > + if (p == data) > + return id; > + return 0; > +} > + > +void remove_uctx(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > +{ > + int pid = current_pid(); > + unsigned long flags; > + int id; > + > + if (get_uctx(nd_dev, pid) == uctx) > + remove_handle(nd_dev, &nd_dev->uctxidr, pid); > + else { > + hvnd_warn("uctx %p not found on pid %d, doing a idr search\n", > + uctx, current_pid()); > + > + spin_lock_irqsave(&nd_dev->id_lock, flags); > + id = idr_for_each(&nd_dev->uctxidr, idr_callback, uctx); > + spin_unlock_irqrestore(&nd_dev->id_lock, flags); > + > + if (id) > + remove_handle(nd_dev, &nd_dev->uctxidr, id); > + else { > + hvnd_error("uctx %p not found in idr table\n", uctx); > + return; > + } > + } > + > + kfree(uctx); > +} > + > +int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > +{ > + int ret; > + > + /* > + * First close the adaptor. > + */ > + > + ret = hvnd_free_handle(nd_dev, uctx, > + uctx->adaptor_hdl, > + IOCTL_ND_ADAPTER_CLOSE); > + > + if (ret) > + hvnd_error("Adaptor close failed; ret is %x\n", ret); > + > + /* > + * Now close the two files we created. > + */ > + > + ret = hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, > + uctx->file_handle_ovl.remote); > + > + if (ret) > + hvnd_error("file cleanup failed; ret is %x\n", ret); > + > + ret = hvnd_cleanup_file(nd_dev, uctx->file_handle.local, > + uctx->file_handle.remote); > + > + if (ret) > + hvnd_error("File cleanup failed; ret is %x\n", ret); > + > + /* > + * Remove the uctx from the ID table. > + */ > + remove_uctx(nd_dev, uctx); > + > + return 0; > +} > + > +int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > +{ > + int ret; > + struct pkt_nd_provider_ioctl *pr_init = &uctx->pr_init_pkt; > + int pkt_type; > + struct nd_handle ioctl_handle; > + struct pkt_nd_open_adapter *pr_o_adap = &uctx->o_adap_pkt; > + > + ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt, 0); > + if (ret) { > + hvnd_error("hvnd_create_file failed ret=%d\n", ret); > + goto error_cr; > + } > + > + if (uctx->create_pkt.pkt_hdr.status != 0) { > + hvnd_error("create File failed; status is %d\n", > + uctx->create_pkt.pkt_hdr.status); > + ret = -EINVAL; > + goto error_cr; > + } > + > + uctx->file_handle.local = uctx->create_pkt.handle.local; > + uctx->file_handle.remote = uctx->create_pkt.handle.remote; > + hvnd_debug("INITIALIZE PROVIDER\n"); > + /* > + * Now Initialize the Provider. > + */ > + ioctl_handle.handle = 0; > + ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_INIT, pr_init, > + &uctx->create_pkt.handle, > + &ioctl_handle, NULL, 0, false, false, 0); > + > + if (ret) { > + ret = -EINVAL; > + goto error_pr_init; > + } > + > + if (pr_init->hdr.pkt_hdr.status != 0) { > + hvnd_error("Provider INIT failed; status is %d\n", > + pr_init->hdr.pkt_hdr.status); > + ret = -EINVAL; > + goto error_pr_init; > + } > + > + if (pr_init->hdr.io_status != 0) { > + hvnd_error("Provider INIT failed; io status is %d\n", > + pr_init->hdr.io_status); > + ret = -EINVAL; > + goto error_pr_init; > + } > + > + /* > + * Now create the overlap file. 
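On remove_uctx() above: idr_for_each() stops at the first callback that returns non-zero and propagates that value, so idr_callback() doubles as a pointer-to-id reverse lookup. Note this only works because id 0 can never hold a ucontext (pid 0 is never a userspace caller); condensed, the idiom is:

	/* reverse-map a uctx pointer to its idr id; 0 means not found */
	static int hvnd_uctx_to_id(struct hvnd_dev *nd_dev, void *uctx)
	{
		unsigned long flags;
		int id;

		spin_lock_irqsave(&nd_dev->id_lock, flags);
		id = idr_for_each(&nd_dev->uctxidr, idr_callback, uctx);
		spin_unlock_irqrestore(&nd_dev->id_lock, flags);
		return id;
	}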
> + */ > + > + hvnd_debug("CREATE OVERLAP FILE\n"); > + ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt_ovl, > + FILE_FLAG_OVERLAPPED); > + if (ret) { > + hvnd_error("hvnd_create_file failed ret=%d\n", ret); > + goto error_pr_init; > + } > + > + if (uctx->create_pkt_ovl.pkt_hdr.status != 0) { > + hvnd_error("create Overlap File failed; status is %d\n", > + uctx->create_pkt_ovl.pkt_hdr.status); > + ret = -EINVAL; > + goto error_pr_init; > + } > + uctx->file_handle_ovl.local = uctx->create_pkt_ovl.handle.local; > + uctx->file_handle_ovl.remote = uctx->create_pkt_ovl.handle.remote; > + > + /* > + * Now bind the two file handles together. > + */ > + > + hvnd_debug("BIND FILE IOCTL remote handle: %d local handle: %d\n", > + uctx->create_pkt_ovl.handle.remote, > + uctx->create_pkt_ovl.handle.local); > + > + ioctl_handle.handle = uctx->create_pkt_ovl.handle.val64; > + ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_BIND_FILE, pr_init, > + &uctx->create_pkt.handle, > + &ioctl_handle, NULL, 0, false, false, 0); > + > + if (ret) { > + ret = -EINVAL; > + goto error_file_bind; > + } > + if (pr_init->hdr.pkt_hdr.status != 0) { > + hvnd_error("Provider File bind failed; status is %d\n", > + pr_init->hdr.pkt_hdr.status); > + ret = -EINVAL; > + goto error_file_bind; > + } > + if (pr_init->hdr.io_status != 0) { > + hvnd_error("Provider INIT failed; io status is %d\n", > + pr_init->hdr.io_status); > + ret = -EINVAL; > + goto error_file_bind; > + } > + > + /* > + * Now open the adaptor. > + */ > + > + hvnd_debug("OPENING THE ADAPTOR\n"); > + > + pkt_type = NDV_PKT_ID1_CONTROL; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + pr_o_adap->hdr.pkt_hdr.packet_type = pkt_type; > + pr_o_adap->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > + pr_o_adap->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_open_adapter) - > + sizeof(struct ndv_packet_hdr_control_1); > + > + pr_o_adap->hdr.pkt_hdr.status = 0; > + > + hvnd_debug("hdr sz is %d\n", pr_o_adap->hdr.pkt_hdr.hdr_sz); > + hvnd_debug("data sz is %d\n", pr_o_adap->hdr.pkt_hdr.data_sz); > + > + pr_o_adap->hdr.file_handle.local = uctx->create_pkt.handle.local; > + pr_o_adap->hdr.file_handle.remote = uctx->create_pkt.handle.remote; > + hvnd_debug("create handle local is %x\n", > + uctx->create_pkt.handle.local); > + hvnd_debug("create handle remote is %x\n", > + uctx->create_pkt.handle.remote); > + pr_o_adap->hdr.irp_handle.val64 = 0; > + > + pr_o_adap->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_OPEN; > + pr_o_adap->hdr.output_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - > + sizeof(struct extended_data_oad); > + pr_o_adap->hdr.input_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - > + sizeof(struct extended_data_oad); > + > + hvnd_debug("output buf sz is %d\n", pr_o_adap->hdr.output_buf_sz); > + hvnd_debug("input buf sz is %d\n", pr_o_adap->hdr.input_buf_sz); > + hvnd_debug("packet size is %d\n", > + (int)sizeof(struct pkt_nd_open_adapter)); > + > + pr_o_adap->hdr.input_output_buf_offset = 0; > + > + > + pr_o_adap->hdr.extended_data.size = sizeof(struct extended_data_oad); > + pr_o_adap->hdr.extended_data.offset = > + offsetof(struct pkt_nd_open_adapter, ext_data) - > + sizeof(struct ndv_packet_hdr_control_1); > + > + hvnd_debug("size of the extended data size: %d\n", > + (int)sizeof(struct extended_data_oad)); > + hvnd_debug("offset of extended data: %d\n", > + pr_o_adap->hdr.extended_data.offset); > + > + /* > + * Now fill out the ioctl section. 
> + */ > + > + pr_o_adap->ioctl.input.version = ND_VERSION_1; > + pr_o_adap->ioctl.input.ce_mapping_cnt = > + RTL_NUMBER_OF(pr_o_adap->mappings.ctx_input.mappings); > + > + hvnd_debug("ce_mapping cnt is %d\n", > + pr_o_adap->ioctl.input.ce_mapping_cnt); > + > + pr_o_adap->ioctl.input.cb_mapping_offset = sizeof(union oad_ioctl); > + hvnd_debug("cb_mapping offset is %d\n", > + pr_o_adap->ioctl.input.cb_mapping_offset); > + pr_o_adap->ioctl.input.adapter_id = (u64)nd_dev; > + > + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_type = ND_MAP_IOSPACE; > + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cache_type = ND_NON_CACHED; > + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cb_length = 4096; > + > + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_type = ND_MAP_IOSPACE; > + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cache_type = ND_WRITE_COMBINED; > + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cb_length = 4096; > + > + /* > + * Fill in the extended data. > + */ > + pr_o_adap->ext_data.cnt = IBV_GET_CONTEXT_MAPPING_MAX; > + > + ret = hvnd_send_packet(nd_dev, pr_o_adap, > + sizeof(struct pkt_nd_open_adapter), > + (unsigned long)pr_o_adap, true); > + if (ret) { > + ret = -EINVAL; > + goto error_file_bind; > + } > + > + if (pr_o_adap->hdr.pkt_hdr.status != 0) { > + hvnd_error("Open adaptor failed; status is %d\n", > + pr_o_adap->hdr.pkt_hdr.status); > + ret = -EINVAL; > + goto error_file_bind; > + } > + > + if (pr_o_adap->hdr.io_status != 0) { > + hvnd_error("Open adaptor failed;io status is %d\n", > + pr_o_adap->hdr.io_status); > + ret = -EINVAL; > + goto error_file_bind; > + } > + > + /* > + * Copy the necessary response from the host. > + */ > + > + uctx->adaptor_hdl = pr_o_adap->ioctl.resrc_desc.handle; > + > + > + hvnd_debug("adaptor handle: %p\n", (void *)uctx->adaptor_hdl); > + > + uctx->uar_base = > + pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_UAR].info; > + hvnd_debug("uar base: %p\n", (void *)uctx->uar_base); > + > + uctx->bf_base = > + pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_BF].info; > + hvnd_debug("bf base: %p\n", (void *)uctx->bf_base); > + > + uctx->bf_buf_size = > + pr_o_adap->mappings.ctx_output.bf_buf_size; > + hvnd_debug("bf buf size: %d\n", uctx->bf_buf_size); > + > + uctx->bf_offset = > + pr_o_adap->mappings.ctx_output.bf_offset; > + hvnd_debug("bf offset: %d\n", uctx->bf_offset); > + > + uctx->cqe_size = > + pr_o_adap->mappings.ctx_output.cqe_size; > + hvnd_debug("cqe size: %d\n", uctx->cqe_size); > + > + uctx->max_qp_wr = > + pr_o_adap->mappings.ctx_output.max_qp_wr; > + hvnd_debug("max qp wr: %d\n", uctx->max_qp_wr); > + > + uctx->max_sge = > + pr_o_adap->mappings.ctx_output.max_sge; > + hvnd_debug("max sge: %d\n", uctx->max_sge); > + > + uctx->max_cqe = > + pr_o_adap->mappings.ctx_output.max_cqe; > + hvnd_debug("max cqe: %d\n", uctx->max_cqe); > + > + uctx->num_qps = > + pr_o_adap->mappings.ctx_output.qp_tab_size; > + hvnd_debug("num qps: %d\n", uctx->num_qps); > + > + /* > + * Now query the adaptor and stash away the adaptor info. 
> + */ > + > + ret = hvnd_query_adaptor(nd_dev, uctx); > + if (ret) { > + hvnd_error("Query Adaptor failed; ret is %d\n", ret); > + goto query_err; > + } > + > + return ret; > + > +query_err: > + hvnd_free_handle(nd_dev, uctx, > + uctx->adaptor_hdl, > + IOCTL_ND_ADAPTER_CLOSE); > + > + hvnd_error("Open Adaptor Failed!!\n"); > + > +error_file_bind: > + hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, > + uctx->file_handle_ovl.remote); > + > +error_pr_init: > + hvnd_cleanup_file(nd_dev, uctx->file_handle.local, > + uctx->file_handle.remote); > + > +error_cr: > + if (get_uctx(nd_dev, current_pid()) != NULL) > + remove_handle(nd_dev, &nd_dev->uctxidr, current_pid()); > + > + return ret; > +} > + > +int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_cq *cq) > +{ > + int ret; > + struct pkt_nd_create_cq *pkt; > + int num_pfn, num_db_pfn, num_sn_pfn; > + int cq_pkt_size; > + unsigned int cq_buf_size, offset; > + u32 ext_data_sz; > + u32 ext_data_offset; > + > + /* > + * Now create CQ. > + * First compute the number of PFNs we need to accommodate: > + * one each for doorbell and arm_sn, and the pages in the cq buffer. > + */ > + cq_buf_size = (cq->entries * uctx->cqe_size); > + offset = offset_in_page(cq->cq_buf); > + num_pfn = DIV_ROUND_UP(offset + cq_buf_size, PAGE_SIZE); > + > + offset = offset_in_page(cq->db_addr); > + num_db_pfn = DIV_ROUND_UP(offset + 8, PAGE_SIZE); > + > + offset = offset_in_page(&cq->arm_sn); > + num_sn_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); > + > + cq_pkt_size = sizeof(struct pkt_nd_create_cq) + > + (num_pfn * sizeof(u64)); > + > + ext_data_sz = sizeof(struct create_cq_ext_data) + > + (num_pfn * sizeof(u64)); > + ext_data_offset = offsetof(struct pkt_nd_create_cq, ext_data) - > + sizeof(struct ndv_packet_hdr_control_1); > + > + hvnd_debug("CREATE CQ, num user addr pfns is %d\n", num_pfn); > + hvnd_debug("CREATE CQ, num db pfns is %d\n", num_db_pfn); > + > + pkt = kzalloc(cq_pkt_size, GFP_KERNEL); > + > + if (!pkt) > + return -ENOMEM; > + > + hvnd_init_hdr(&pkt->hdr, > + (cq_pkt_size - > + sizeof(struct ndv_packet_hdr_control_1)), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CQ_CREATE, > + ext_data_sz, > + ext_data_offset, > + 0); > + > + /* > + * Now fill out the ioctl section. 
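The PFN arithmetic above deserves a worked example, because the in-page offset is easy to forget: with PAGE_SIZE 4096 and, hypothetically, 256 entries of a 64-byte CQE, the buffer is 16384 bytes; if cq_buf starts at page offset 0x800 then

	u32 offset  = 0x800;		/* offset_in_page(cq->cq_buf) */
	u32 size    = 256 * 64;		/* entries * uctx->cqe_size   */
	int num_pfn = DIV_ROUND_UP(offset + size, PAGE_SIZE);	/* == 5 */

where a naive size / PAGE_SIZE would under-count at 4 and lose the tail page.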
> + */ > + > + pkt->ioctl.input.version = ND_VERSION_1; > + pkt->ioctl.input.queue_depth = cq->entries; > + pkt->ioctl.input.ce_mapping_cnt = MLX4_IB_CREATE_CQ_MAPPING_MAX; > + pkt->ioctl.input.cb_mapping_offset = sizeof(union create_cq_ioctl); > + > + hvnd_debug("ce_mapping cnt is %d\n", pkt->ioctl.input.ce_mapping_cnt); > + hvnd_debug("cb_mapping offset is %d\n", > + pkt->ioctl.input.cb_mapping_offset); > + > + pkt->ioctl.input.adapter_handle = uctx->adaptor_hdl; > + pkt->ioctl.input.affinity.mask = 0; > + pkt->ioctl.input.affinity.group = -1; > + > + // 0 for usermode CQ arming > + pkt->mappings.cq_in.flags = 0; > + > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.map_type = ND_MAP_MEMORY; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.access_type = ND_MODIFY_ACCESS; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.address = (u64)cq->cq_buf; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.cb_length = (cq->entries * uctx->cqe_size); > + > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.access_type = ND_WRITE_ACCESS; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.address = (u64)cq->db_addr; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.cb_length = 8; //size of two ints > + > + > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.map_type = ND_MAP_MEMORY; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.access_type = ND_MODIFY_ACCESS; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.address = (u64)&cq->arm_sn; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.cb_length = 4; //size of one int > + /* > + * Fill in the extended data. > + */ > + > + pkt->ext_data.cnt = 3; > + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].size = (sizeof(struct gpa_range) + (num_pfn * sizeof(u64))); > + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].offset = offsetof(struct create_cq_ext_data, cqbuf_gpa); > + > + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].size = sizeof(struct cq_db_gpa); > + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].offset = offsetof(struct create_cq_ext_data, db_gpa); > + > + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].size = sizeof(struct cq_db_gpa); > + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].offset = offsetof(struct create_cq_ext_data, sn_gpa); > + > + /* > + * Fill up the gpa range for cq buffer. > + */ > + > + pkt->ext_data.db_gpa.byte_count = 8; > + pkt->ext_data.db_gpa.byte_offset = offset_in_page(cq->db_addr); > + user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], cq->db_umem); > + > + pkt->ext_data.sn_gpa.byte_count = 4; > + pkt->ext_data.sn_gpa.byte_offset = offset_in_page(&cq->arm_sn); > + init_pfn(&pkt->ext_data.sn_gpa.pfn_array[0], > + &cq->arm_sn, > + 4); > + > + pkt->ext_data.cqbuf_gpa.byte_count = (cq->entries * uctx->cqe_size); > + pkt->ext_data.cqbuf_gpa.byte_offset = offset_in_page(cq->cq_buf); > + user_va_init_pfn(&pkt->ext_data.cqbuf_gpa.pfn_array[0], cq->umem); > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, cq_pkt_size, (u64)pkt); > + > + if (ret) > + goto cr_cq_err; > + > + /* > + * Copy the necessary response from the host. 
> + */ > + cq->cqn = pkt->mappings.cq_resp.cqn; > + cq->cqe = pkt->mappings.cq_resp.cqe; > + cq->cq_handle = pkt->ioctl.resrc_desc.handle; > + > + ret = insert_handle(nd_dev, &nd_dev->cqidr, cq, cq->cqn); > + > + if (ret) > + goto cr_cq_err; > + hvnd_debug("CQ create after success cqn is %d\n", cq->cqn); > + hvnd_debug("CQ create after success cqe is %d\n", cq->cqe); > + hvnd_debug("CQ create after success cq handle is %p\n", > + (void *)cq->cq_handle); > + > +cr_cq_err: > + kfree(pkt); > + return ret; > +} > + > +int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq) > +{ > + struct pkt_nd_free_cq free_cq_pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&free_cq_pkt, 0, sizeof(free_cq_pkt)); > + hvnd_init_hdr(&free_cq_pkt.hdr, > + sizeof(struct pkt_nd_free_cq) - > + sizeof(struct ndv_packet_hdr_control_1), > + cq->uctx->create_pkt.handle.local, > + cq->uctx->create_pkt.handle.remote, > + IOCTL_ND_CQ_FREE, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + free_cq_pkt.ioctl.in.version = ND_VERSION_1; > + free_cq_pkt.ioctl.in.handle = cq->cq_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &free_cq_pkt.hdr, > + sizeof(struct pkt_nd_free_cq), > + (u64)&free_cq_pkt); > + > + if (ret) > + goto free_cq_err; > + > + remove_handle(nd_dev, &nd_dev->cqidr, cq->cqn); > + > + return 0; > + > +free_cq_err: > + return ret; > +} > + > + > +int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq, > + u32 notify_type, u64 irp_handle) > +{ > + struct pkt_nd_notify_cq notify_cq_pkt; > + int ret; > + union ndv_context_handle irp_fhandle; > + > + irp_fhandle.local = cq->ep_object.local_irp; > + > + // KYS try to avoid having to zero everything > + memset(¬ify_cq_pkt, 0, sizeof(notify_cq_pkt)); > + hvnd_init_hdr(¬ify_cq_pkt.hdr, > + sizeof(struct pkt_nd_notify_cq) - > + sizeof(struct ndv_packet_hdr_control_1), > + cq->uctx->create_pkt.handle.local, > + cq->uctx->create_pkt.handle.remote, > + IOCTL_ND_CQ_NOTIFY, 0, 0, irp_fhandle.val64); > + > + /* > + * Now fill in the ioctl section. > + */ > + notify_cq_pkt.ioctl.in.version = ND_VERSION_1; > + notify_cq_pkt.ioctl.in.cq_handle = cq->cq_handle; > + notify_cq_pkt.ioctl.in.type = notify_type; > + > + > + ret = hvnd_send_ioctl_pkt(nd_dev, ¬ify_cq_pkt.hdr, > + sizeof(struct pkt_nd_notify_cq), > + (u64)¬ify_cq_pkt); > + > + return ret; > +} > + > +/* > + * Memory region operations. > + */ > +int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 pd_handle, u64 *mr_handle) > +{ > + struct pkt_nd_create_mr pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_MR_CREATE, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.handle = pd_handle; > + hvnd_debug("PD handle is %p\n", (void *)pd_handle); > + pkt.ioctl.in.reserved = 0; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + /* > + * Copy the handle. 
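On the insert_handle(..., cq->cqn) call at the top of this hunk: keying the CQ by its hardware cqn in cqidr is presumably what lets the completion/event path get from a cqn in a host message back to the hvnd_cq, along the lines of:

	struct hvnd_cq *cq;
	unsigned long flags;

	spin_lock_irqsave(&nd_dev->id_lock, flags);
	cq = idr_find(&nd_dev->cqidr, cqn);	/* sketch; not in this hunk */
	spin_unlock_irqrestore(&nd_dev->id_lock, flags);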
> + */ > + hvnd_debug("mr handle is %p\n", (void *)pkt.ioctl.out); > + *mr_handle = pkt.ioctl.out; > + > + return 0; > + > +err: > + hvnd_error("create mr failed: %d\n", ret); > + return ret; > + > +} > + > +int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle) > +{ > + return hvnd_free_handle(nd_dev, uctx, handle, IOCTL_ND_MR_FREE); > +} > + > +int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle) > +{ > + struct pkt_nd_deregister_mr pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_MR_DEREGISTER, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.handle = handle; > + pkt.ioctl.in.reserved = 0; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + hvnd_error("de-register mr failed: %d\n", ret); > + return ret; > + > +} > + > +static inline u32 hvnd_convert_access(int acc) > +{ > + return (acc & IB_ACCESS_REMOTE_WRITE ? ND_MR_FLAG_ALLOW_REMOTE_WRITE : 0) | > + (acc & IB_ACCESS_REMOTE_READ ? ND_MR_FLAG_ALLOW_REMOTE_READ : 0) | > + (acc & IB_ACCESS_LOCAL_WRITE ? ND_MR_FLAG_ALLOW_LOCAL_WRITE : 0); > +} > + > + > +int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_mr *mr) > +{ > + struct pkt_nd_register_mr pkt; > + int ret; > + struct hv_mpb_array *pb; > + struct vmbus_packet_mpb_array *tpb; > + int sz_leaf; > + int num_pgs; > + int i = 0; > + int ext_data_sz; > + u32 acc_flags; > + u32 desc_size; > + int pkt_type; > + > + /* > + * The user address is passed in via a two level structure. > + * An Array of struct hv_page_buffer will be used to describe > + * the user memory. The pages containing this array will be descibed > + * in another array of struct hv_page_buffer. We pass this seconed level > + * array to the host. > + */ > + > + hvnd_debug("ib_umem_page_count(mr->umem)=%d\n", > + ib_umem_page_count(mr->umem)); > + > + sz_leaf = ib_umem_page_count(mr->umem) * sizeof(u64) + > + sizeof(struct hv_mpb_array); > + > + pb = (struct hv_mpb_array *) __get_free_pages(GFP_KERNEL|__GFP_ZERO, > + get_order(sz_leaf)); > + > + if (pb == NULL) > + return -ENOMEM; > + /* > + * Allocate an array of hv_page_buffer to describe the first level. > + */ > + num_pgs = DIV_ROUND_UP(sz_leaf, PAGE_SIZE); > + hvnd_debug("num pages in the top array is %d\n", num_pgs); > + > + desc_size = (num_pgs * sizeof(u64) + > + sizeof(struct vmbus_packet_mpb_array)); > + tpb = (struct vmbus_packet_mpb_array *) > + __get_free_pages(GFP_KERNEL|__GFP_ZERO, get_order(desc_size)); > + > + if (tpb == NULL) { > + free_pages((unsigned long)pb, get_order(sz_leaf)); > + return -ENOMEM; > + } > + > + hvnd_debug("sz leaf: %d; pgs in top %d\n", sz_leaf, num_pgs); > + > + /* > + * Now fill the leaf level array. > + */ > + pb->len = mr->length; > + pb->offset = offset_in_page(mr->start); > + user_va_init_pfn(pb->pfn_array, mr->umem); > + > + /* > + * Now fill out the top level array. 
> + */ > + for (i = 0; i < num_pgs; i++) { > + tpb->range.pfn_array[i] = virt_to_phys( > + (u8 *)pb + (PAGE_SIZE * i)) >> PAGE_SHIFT; > + hvnd_debug("virtual address = %p\n", > + (u8 *)pb + (PAGE_SIZE * i)); > + hvnd_debug("physical address = %llx\n", > + virt_to_phys((u8 *)pb + (PAGE_SIZE * i))); > + hvnd_debug("tpb->range.pfn_array[%d]=%llx\n", i, > + tpb->range.pfn_array[i]); > + } > + > + tpb->range.offset = 8; > + tpb->range.len = ib_umem_page_count(mr->umem) * sizeof(u64); > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + ext_data_sz = (ib_umem_page_count(mr->umem) * sizeof(u64)); > + acc_flags = ND_MR_FLAG_DO_NOT_SECURE_VM | hvnd_convert_access(mr->acc); > + hvnd_debug("memory register access flags are: %x\n", acc_flags); > + > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_MR_REGISTER, 0, 0, 0); > + > + /* > + * The memory registration call uses a different mechanism to pass > + * pfn information. > + */ > + > + pkt_type = pkt.hdr.pkt_hdr.packet_type; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTION_EXTERNAL_DATA); > + pkt.hdr.pkt_hdr.packet_type = pkt_type; > + > + pkt.hdr.extended_data.size = ext_data_sz; > + pkt.hdr.extended_data.offset = 0; > + /* > + * Now fill out the ioctl. > + */ > + > + pkt.ioctl.in.header.version = ND_VERSION_1; > + pkt.ioctl.in.header.flags = acc_flags; > + pkt.ioctl.in.header.cb_length = mr->length; > + pkt.ioctl.in.header.target_addr = mr->virt; > + pkt.ioctl.in.header.mr_handle = mr->mr_handle; > + pkt.ioctl.in.address = mr->virt; > + > + /* > + * Now send the packet to the host. > + */ > + > + ret = hvnd_send_pgbuf_ioctl_pkt(nd_dev, > + tpb, desc_size, > + &pkt.hdr, > + sizeof(pkt), > + (unsigned long)&pkt); > + > + if (ret) > + goto err; > + > + hvnd_info("MR REGISTRATION SUCCESS\n"); > + /* > + * Copy the mr registration data. > + */ > + hvnd_debug("mr registration lkey %x\n", pkt.ioctl.out.lkey); > + hvnd_debug("mr registration rkey %x\n", pkt.ioctl.out.rkey); > + > + mr->mr_lkey = pkt.ioctl.out.lkey; > + mr->mr_rkey = pkt.ioctl.out.rkey; > + > + mr->ibmr.lkey = mr->mr_lkey; > + mr->ibmr.rkey = be32_to_cpu(mr->mr_rkey); > + hvnd_debug("ibmr registration lkey %x\n", mr->ibmr.lkey); > + hvnd_debug("ibmr registration rkey %x\n", mr->ibmr.rkey); > + > + free_pages((unsigned long)pb, get_order(sz_leaf)); > + free_pages((unsigned long)tpb, get_order(desc_size)); > + > + return 0; > + > +err: > + free_pages((unsigned long)pb, get_order(sz_leaf)); > + free_pages((unsigned long)tpb, get_order(desc_size)); > + > + hvnd_error("mr register failed: %d\n", ret); > + return ret; > +} > + > +/* > + * Listener operations. > + */ > +int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 *listener_handle) > +{ > + struct pkt_nd_cr_listener pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_LISTENER_CREATE, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. 
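A sizing note on the two-level scheme in hvnd_mr_register() above: for a hypothetical 1 GB registration, ib_umem_page_count() is 262144, so the leaf PFN table is 262144 * 8 bytes = 2 MB and itself spans 512 pages; only those 512 top-level PFNs plus the vmbus_packet_mpb_array header cross the ring inline, and the host reaches the leaf table, and through it the user pages, via the described PFNs. That keeps the ring packet bounded no matter how large the registration is.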
> + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; > + hvnd_debug("Adaptor handle is %p\n", (void *)uctx->adaptor_hdl); > + pkt.ioctl.in.hdr.reserved = 0; > + pkt.ioctl.in.to_semantics = false; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + /* > + * Copy the listener handle. > + */ > + hvnd_debug("listener handle is %p\n", (void *)pkt.ioctl.out); > + *listener_handle = pkt.ioctl.out; > + > + return 0; > + > +err: > + hvnd_error("create listener failed: ret=%d uctx=%p adaptor handle=%llu\n", > + ret, uctx, uctx->adaptor_hdl); > + return ret; > + > +} > + > +int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle) > +{ > + struct pkt_nd_free_listener pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_LISTENER_FREE, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.handle = listener_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + hvnd_error("free listener failed: %d\n", ret); > + return ret; > +} > + > +int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, union nd_sockaddr_inet *addr) > +{ > + struct pkt_nd_bind_listener pkt; > + kuid_t uid = current_uid(); > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_LISTENER_BIND, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.handle = listener_handle; > + pkt.ioctl.in.hdr.reserved = 0; > + > + pkt.ioctl.in.authentication_id = (u32)uid.val; > + pkt.ioctl.in.is_admin = false; > + > + memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + hvnd_error("bind listener failed: %d\n", ret); > + return ret; > +} > + > +int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, u32 backlog) > +{ > + struct pkt_nd_listen_listener pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_LISTENER_LISTEN, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. 
> + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.listener_handle = listener_handle; > + pkt.ioctl.in.back_log = backlog; > + > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + hvnd_error("listen listener failed: %d\n", ret); > + return ret; > +} > + > +int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, union nd_sockaddr_inet *addr) > +{ > + struct pkt_nd_get_addr_listener pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_LISTENER_GET_ADDRESS, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.handle = listener_handle; > + > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + /* > + * Copy the address. > + */ > + > + memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); > + > + return 0; > + > +err: > + hvnd_error("get address listener failed: %d\n", ret); > + return ret; > +} > + > +int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 listener_handle, u64 connector_handle, > + u64 irp_handle) > +{ > + struct pkt_nd_get_connection_listener pkt; > + int ret; > + union ndv_context_handle irp_fhandle; > + > + ret = get_irp_handle(nd_dev, &irp_fhandle.local, (void *)irp_handle); > + > + if (ret) { > + hvnd_error("get_irp_handle() failed: err: %d\n", ret); > + return ret; > + } > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST, 0, 0, > + irp_fhandle.val64); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.listener_handle = listener_handle; > + pkt.ioctl.in.connector_handle = connector_handle; > + > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + hvnd_error("get connection listener failed: %d\n", ret); > + return ret; > +} > + > +/* > + * Connector APIs. > + */ > + > +int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 *connector_handle) > +{ > + struct pkt_nd_cr_connector pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(struct pkt_nd_cr_listener) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_CREATE, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; > + pkt.ioctl.in.to_semantics = false; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + /* > + * Copy the connector handle. 
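hvnd_get_connection_listener() above is the first asynchronous request in the file: it converts the kernel-side context into a 32-bit handle before the send, and whoever completes the I/O must perform the inverse or the irpidr entry leaks. The expected pairing (sketch; the completion half is outside this hunk):

	u32 irp;

	ret = get_irp_handle(nd_dev, &irp, ep);	/* before the send */
	if (ret)
		return ret;
	/* ... host completes the IRP ... */
	put_irp_handle(nd_dev, irp);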
> + */ > + hvnd_debug("connector handle is %p\n", (void *)pkt.ioctl.out); > + *connector_handle = pkt.ioctl.out; > + > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle) > +{ > + struct pkt_nd_free_connector pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_FREE, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.handle = handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle, union nd_sockaddr_inet *addr) > +{ > + struct pkt_nd_bind_connector pkt; > + int ret; > + kuid_t uid = current_uid(); > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_BIND, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.handle = handle; > + > + memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); > + > + pkt.ioctl.in.authentication_id = (u32)uid.val; > + pkt.ioctl.in.is_admin = false; > + > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit, > + u32 priv_data_length, const u8 *priv_data, > + u64 qp_handle, struct if_physical_addr *phys_addr, > + union nd_sockaddr_inet *dest_addr, > + struct hvnd_ep_obj *ep) > +{ > + struct pkt_nd_connector_connect *pkt = &ep->connector_connect_pkt; > + int ret; > + union ndv_context_handle irp_fhandle; > + > + hvnd_debug("local irp is %d\n", ep->local_irp); > + irp_fhandle.local = ep->local_irp; > + > + if (priv_data_length > MAX_PRIVATE_DATA_LEN) { > + hvnd_error("priv_data_length=%d\n", priv_data_length); > + return -EINVAL; > + } > + > + // KYS try to avoid having to zero everything > + memset(pkt, 0, sizeof(*pkt)); > + hvnd_init_hdr(&pkt->hdr, > + sizeof(*pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_CONNECT, 0, 0, irp_fhandle.val64); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt->ioctl.in.hdr.version = ND_VERSION_1; > + pkt->ioctl.in.hdr.connector_handle = connector_handle; > + pkt->ioctl.in.hdr.read_limits.inbound = in_rd_limit; > + pkt->ioctl.in.hdr.read_limits.outbound = out_rd_limit; > + pkt->ioctl.in.hdr.cb_private_data_length = priv_data_length; > + pkt->ioctl.in.hdr.cb_private_data_offset = > + offsetof(union connector_connect_ioctl, in.priv_data); > + pkt->ioctl.in.hdr.qp_handle = qp_handle; > + > + memcpy(&pkt->ioctl.in.hdr.phys_addr, phys_addr, > + sizeof(struct if_physical_addr)); > + > + /* > + * Luke's code does not copy the ip address. 
> + */ > + memcpy(&pkt->ioctl.in.hdr.destination_address, dest_addr, > + sizeof(union nd_sockaddr_inet)); > + > + pkt->ioctl.in.retry_cnt = 7; > + pkt->ioctl.in.rnr_retry_cnt = 7; > + memcpy(pkt->ioctl.in.priv_data, priv_data, priv_data_length); > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, sizeof(*pkt), (u64)pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + enum ibv_qp_state *qp_state) > +{ > + struct pkt_nd_connector_connect_complete pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_COMPLETE_CONNECT, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.handle = connector_handle; > + pkt.ioctl.in.rnr_nak_to = 0; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + *qp_state = pkt.ioctl.out.state; > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, > + u64 qp_handle, > + u32 in_rd_limit, u32 out_rd_limit, > + u32 priv_data_length, const u8 *priv_data, > + enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep) > +{ > + struct pkt_nd_connector_accept pkt; > + int ret; > + union ndv_context_handle irp_fhandle; > + > + irp_fhandle.local = ep->local_irp; > + > + if (priv_data_length > MAX_PRIVATE_DATA_LEN) { > + hvnd_error("priv_data_length=%d\n", priv_data_length); > + return -EINVAL; > + } > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_ACCEPT, 0, 0, irp_fhandle.val64); > + > + /* > + * Now fill in the ioctl section. 
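connect, accept and reject all repeat the same MAX_PRIVATE_DATA_LEN check before memcpy()ing caller-supplied private data into a fixed in-packet array; that check is the only thing preventing an overflow, so it may be worth factoring out (helper name is mine, not in the patch):

	static int hvnd_copy_priv_data(u8 *dst, const u8 *src, u32 len)
	{
		if (len > MAX_PRIVATE_DATA_LEN)
			return -EINVAL;
		memcpy(dst, src, len);
		return 0;
	}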
> + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.reserved = 0; > + pkt.ioctl.in.hdr.read_limits.inbound = in_rd_limit; > + pkt.ioctl.in.hdr.read_limits.outbound = out_rd_limit; > + pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; > + > + pkt.ioctl.in.hdr.cb_private_data_offset = > + offsetof(struct connector_accept_in, private_data); > + > + pkt.ioctl.in.hdr.connector_handle = connector_handle; > + pkt.ioctl.in.hdr.qp_handle = qp_handle; > + > + pkt.ioctl.in.rnr_nak_to = 0; > + pkt.ioctl.in.rnr_retry_cnt = 7; > + > + > + memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + *qp_state = pkt.ioctl.out.state; > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, > + u32 priv_data_length, u8 *priv_data, > + enum ibv_qp_state *qp_state) > +{ > + struct pkt_nd_connector_reject pkt; > + int ret; > + > + if (priv_data_length > MAX_PRIVATE_DATA_LEN) { > + hvnd_error("priv_data_length=%d\n", priv_data_length); > + return -EINVAL; > + } > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_REJECT, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.reserved = 0; > + pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; > + > + pkt.ioctl.in.hdr.cb_private_data_offset = > + offsetof(struct connector_reject_in, private_data); > + > + pkt.ioctl.in.hdr.connector_handle = connector_handle; > + > + memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + *qp_state = pkt.ioctl.out.state; > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + struct nd_read_limits *rd_limits) > +{ > + struct pkt_nd_connector_get_rd_limits pkt; > + int ret; > + > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_GET_READ_LIMITS, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. 
> + */ > + pkt.ioctl.in.in.version = ND_VERSION_1; > + pkt.ioctl.in.in.reserved = 0; > + pkt.ioctl.in.in.handle = connector_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + *rd_limits = pkt.ioctl.out.out; > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + u8 *priv_data) > +{ > + struct pkt_nd_connector_get_priv_data pkt; > + int ret; > + > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = connector_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + memcpy(priv_data, pkt.ioctl.out, MAX_PRIVATE_DATA_LEN); > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + union nd_sockaddr_inet *peer_addr) > +{ > + struct pkt_nd_connector_get_peer_addr pkt; > + int ret; > + > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = connector_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + memcpy(peer_addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + union nd_sockaddr_inet *addr) > +{ > + struct pkt_nd_connector_get_addr pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_GET_ADDRESS, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. 
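The four connector query ioctls in this stretch (read limits, private data, peer address, local address) are structurally identical: zero the packet, hvnd_init_hdr() with the ioctl code, set version/reserved/handle, hvnd_send_ioctl_pkt(), copy pkt.ioctl.out. Only the ioctl code, the packet type and the output size differ, so a shared helper could collapse them; not sketched here since the per-ioctl packet types live in vmbus_rdma.h, outside this hunk.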
> + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = connector_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); > + return 0; > + > +err: > + return ret; > +} > + > + > +int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, struct hvnd_ep_obj *ep) > +{ > + struct pkt_nd_connector_notify_disconnect pkt; > + int ret; > + union ndv_context_handle irp_fhandle; > + > + irp_fhandle.local = ep->local_irp; > + > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT, 0, 0, > + irp_fhandle.val64); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = connector_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + return ret; > +} > + > + > +//ASYNCH call > +int hvnd_connector_disconnect(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, struct hvnd_ep_obj *ep) > +{ > + struct pkt_nd_connector_disconnect pkt; > + int ret; > + union ndv_context_handle irp_fhandle; > + > + irp_fhandle.local = ep->local_irp; > + > + > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_DISCONNECT, 0, 0, irp_fhandle.val64); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = connector_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + return ret; > +} > + > +/* > + * QP operations. > + */ > +int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_qp *qp) > +{ > + int ret; > + struct pkt_nd_create_qp *pkt; > + int num_pfn, num_db_pfn; > + int qp_pkt_size; > + unsigned int offset; > + u32 ext_data_offset; > + u32 ext_data_size; > + > + /* > + * Now create QP. > + * First compute the number of PFNs we need to accommodate: > + * one for the doorbell and the pages in the qp buffer. 
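Unlike every other request in this file, hvnd_connector_notify_disconnect() and hvnd_connector_disconnect() above build their packet without the preceding memset(&pkt, 0, sizeof(pkt)), so whatever the ioctl union contains beyond version/reserved/handle goes to the host as uninitialized stack bytes. Zeroing the packet the same way as the other senders looks like the safe fix:

	memset(&pkt, 0, sizeof(pkt));	/* before hvnd_init_hdr(), as elsewhere */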
> + */ > + offset = offset_in_page(qp->qp_buf); > + num_pfn = DIV_ROUND_UP(offset + qp->buf_size, PAGE_SIZE); > + > + offset = offset_in_page(qp->db_addr); > + num_db_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); > + > + qp_pkt_size = sizeof(struct pkt_nd_create_qp) + > + (num_pfn * sizeof(u64)); > + > + hvnd_debug("CREATE QP, num pfns is %d\n", num_pfn); > + hvnd_debug("CREATE QP, num DB pfns is %d\n", num_db_pfn); > + > + pkt = kzalloc(qp_pkt_size, GFP_KERNEL); > + > + if (!pkt) > + return -ENOMEM; > + > + hvnd_debug("offset of nd_create_qp is %d\n", > + (int)offsetof(struct pkt_nd_create_qp, ioctl.input)); > + > + ext_data_offset = offsetof(struct pkt_nd_create_qp, ext_data) - > + sizeof(struct ndv_packet_hdr_control_1); > + > + ext_data_size = sizeof(struct create_qp_ext_data) + > + (num_pfn * sizeof(u64)); > + > + hvnd_init_hdr(&pkt->hdr, > + qp_pkt_size - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_QP_CREATE, > + ext_data_size, > + ext_data_offset, > + 0); > + > + /* > + * Now fill out the ioctl section. > + */ > + > + pkt->ioctl.input.hdr.version = ND_VERSION_1; > + > + if (qp->max_inline_data > nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold) > + qp->max_inline_data = nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold; > + pkt->ioctl.input.hdr.cb_max_inline_data = qp->max_inline_data; > + > + hvnd_debug("pkt->ioctl.input.hdr.cb_max_inline_data=%d\n", > + pkt->ioctl.input.hdr.cb_max_inline_data); > + > + pkt->ioctl.input.hdr.ce_mapping_cnt = MLX4_IB_CREATE_QP_MAPPINGS_MAX; > + pkt->ioctl.input.hdr.cb_mapping_offset = sizeof(union create_qp_ioctl); > + > + pkt->ioctl.input.hdr.initiator_queue_depth = qp->initiator_q_depth; > + pkt->ioctl.input.hdr.max_initiator_request_sge = > + qp->initiator_request_sge; > + > + hvnd_debug("recv cq handle is %p\n", (void *)qp->receive_cq_handle); > + hvnd_debug("send cq handle is %p\n", (void *)qp->initiator_cq_handle); > + hvnd_debug("pd handle is %p\n", (void *)qp->pd_handle); > + pkt->ioctl.input.hdr.receive_cq_handle = qp->receive_cq_handle; > + pkt->ioctl.input.hdr.initiator_cq_handle = qp->initiator_cq_handle; > + pkt->ioctl.input.hdr.pd_handle = qp->pd_handle; > + > + > + hvnd_debug("ce_mapping cnt is %d\n", > + pkt->ioctl.input.hdr.ce_mapping_cnt); > + hvnd_debug("cb_mapping offset is %d\n", > + pkt->ioctl.input.hdr.cb_mapping_offset); > + > + pkt->ioctl.input.receive_queue_depth = qp->receive_q_depth; > + pkt->ioctl.input.max_receive_request_sge = qp->receive_request_sge; > + > + > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.map_type = ND_MAP_MEMORY; > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.access_type = ND_MODIFY_ACCESS; > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.address = (u64)qp->qp_buf; > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.cb_length = qp->buf_size; > + > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.access_type = ND_WRITE_ACCESS; > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.address = (u64)qp->db_addr; > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.cb_length = 4; > + > + pkt->mappings.qp_in.log_sq_bb_count = qp->log_sq_bb_count; > + pkt->mappings.qp_in.log_sq_stride = qp->log_sq_stride; > + pkt->mappings.qp_in.sq_no_prefetch = qp->sq_no_prefetch; > + > + > + /* > + * Fill in the 
extended data. > + */ > + > + pkt->ext_data.cnt = 2; > + pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].size = > + sizeof(struct gpa_range) + (num_pfn * sizeof(u64)); > + pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].offset = > + offsetof(struct create_qp_ext_data, qpbuf_gpa); > + > + pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].size = > + sizeof(struct qp_db_gpa); > + pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].offset = > + offsetof(struct create_qp_ext_data, db_gpa); > + > + /* > + * Fill up the gpa range for qp buffer. > + */ > + > + pkt->ext_data.db_gpa.byte_count = 4; // KYS 8 or 16? > + pkt->ext_data.db_gpa.byte_offset = offset_in_page(qp->db_addr); > + user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], qp->db_umem); > + > + pkt->ext_data.qpbuf_gpa.byte_count = qp->buf_size; > + pkt->ext_data.qpbuf_gpa.byte_offset = offset_in_page(qp->qp_buf); > + user_va_init_pfn(&pkt->ext_data.qpbuf_gpa.pfn_array[0], qp->umem); > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, qp_pkt_size, (u64)pkt); > + > + if (ret) > + goto cr_qp_err; > + > + /* > + * Copy the necessary response from the host. > + */ > + qp->qp_handle = pkt->ioctl.resrc_desc.handle; > + > + qp->qpn = pkt->mappings.qp_resp.qpn; > + qp->max_send_wr = pkt->mappings.qp_resp.max_send_wr; > + qp->max_recv_wr = pkt->mappings.qp_resp.max_recv_wr; > + qp->max_send_sge = pkt->mappings.qp_resp.max_send_sge; > + qp->max_recv_sge = pkt->mappings.qp_resp.max_recv_sge; > + > + > + hvnd_debug("qp->max_send_wr=%d max_recv_wr=%d max_send_sge=%d max_recv_sge=%d max_inline_data=%d\n", > + qp->max_send_wr, qp->max_recv_wr, qp->max_send_sge, > + qp->max_recv_sge, qp->max_inline_data); > + > + ret = insert_handle(nd_dev, &nd_dev->qpidr, qp, qp->qpn); > + > + if (ret) > + goto cr_qp_err; > + > + hvnd_debug("QP create after success qpn:%d qp:%p handle:%llu\n", > + qp->qpn, qp, qp->qp_handle); > + > +cr_qp_err: > + kfree(pkt); > + return ret; > +} > + > +int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_qp *qp) > +{ > + int ret; > + > + ret = hvnd_free_handle(nd_dev, uctx, qp->qp_handle, IOCTL_ND_QP_FREE); > + if (ret == 0) > + remove_handle(nd_dev, &nd_dev->qpidr, qp->qpn); > + return ret; > +} > + > +int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_qp *qp) > +{ > + struct pkt_nd_flush_qp pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_QP_FLUSH, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = qp->qp_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + return ret; > +} > + > + > +int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, > + char *mac_addr) > +{ > + int ret; > + int pkt_type = NDV_PKT_ID1_BIND; > + > + /* > + * Send the bind information over to the host. > + * For now, we will have a single ip and MAC address that we > + * will deal with. Down the road we will need to expand support > + * for multiple IP and MAC addresses and also deal with changing > + * IP addresses. 
> + */ > + > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + hvnd_debug("bind packet type is %d ID:%d\n", > + pkt_type, NDV_PACKET_TYPE_ID(pkt_type)); > + nd_dev->bind_pkt.pkt_hdr.packet_type = pkt_type; > + > + nd_dev->bind_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_bind_1); > + hvnd_debug("bind packet size is %d\n", > + (int)sizeof(struct ndv_pkt_hdr_bind_1)); > + nd_dev->bind_pkt.pkt_hdr.data_sz = 0; > + nd_dev->bind_pkt.unbind = un_bind; > + nd_dev->bind_pkt.ip_address.address_family = AF_INET; > + nd_dev->bind_pkt.ip_address.ipv4.sin_family = AF_INET; > + nd_dev->bind_pkt.ip_address.ipv4.sin_port = 0; > + nd_dev->bind_pkt.ip_address.ipv4.sin_addr.s_addr = *(unsigned int *)ip_addr; > + > + nd_dev->bind_pkt.phys_addr.length = ETH_ALEN; > + memcpy(nd_dev->bind_pkt.phys_addr.addr, mac_addr, ETH_ALEN); > + > + /* > + * This is the adapter handle; needs to be unique for each > + * MAC, ip address tuple. > + */ > + nd_dev->bind_pkt.guest_id = (u64)nd_dev; > + > + ret = hvnd_send_packet(nd_dev, &nd_dev->bind_pkt, > + sizeof(struct ndv_pkt_hdr_bind_1), > + (u64)NULL, > + true); > + return ret; > +} > + > +int hvnd_init_resources(struct hvnd_dev *nd_dev) > +{ > + unsigned long mmio_sz; > + struct resource *resrc; > + int ret = -ENOMEM; > + > + resrc = &iomem_resource; > + > + mmio_sz = nd_dev->hvdev->channel->offermsg.offer.mmio_megabytes * > + 1024 * 1024; > + nd_dev->mmio_sz = mmio_sz; > + nd_dev->mmio_resource.name = KBUILD_MODNAME; > + nd_dev->mmio_resource.flags = IORESOURCE_MEM | IORESOURCE_BUSY; > + > + ret = allocate_resource(resrc, &nd_dev->mmio_resource, > + mmio_sz, 0, -1, mmio_sz, NULL, NULL); > + > + if (ret) { > + hvnd_error("Unable to allocate mmio resources\n"); > + return ret; > + } > + hvnd_debug("MMIO start is %p\n", (void *)nd_dev->mmio_resource.start); > + > + /* > + * Send the mmio information over to the host. > + */ > + nd_dev->resources.pkt_hdr.packet_type = NDV_PKT_ID1_INIT_RESOURCES; > + nd_dev->resources.pkt_hdr.hdr_sz = sizeof(union ndv_packet_hdr); > + nd_dev->resources.pkt_hdr.data_sz = 0; > + > + nd_dev->resources.io_space_sz_mb = mmio_sz; > + nd_dev->resources.io_space_start = nd_dev->mmio_resource.start; > + > + ret = hvnd_send_packet(nd_dev, &nd_dev->resources, > + sizeof(struct ndv_pkt_hdr_init_resources_1), > + (u64)NULL, > + true); > + return ret; > +} > + > +int hvnd_query_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > +{ > + struct pkt_nd_query_adaptor *pkt; > + int ret; > + int pkt_type; > + struct ib_device_attr *props = &nd_dev->ibdev.attrs; > + struct adapter_info_v2 *adap_info = &nd_dev->query_pkt.ioctl.ad_info; > + > + hvnd_debug("Performing Adapter query nd_dev=%p\n", nd_dev); > + > + // check if there is a need to do query > + if (nd_dev->query_pkt_set) > + return 0; > + > + // need a lock, multiple processes can call this at the same time > + > + down(&nd_dev->query_pkt_sem); > + if (nd_dev->query_pkt_set) { > + up(&nd_dev->query_pkt_sem); > + return 0; > + } > + > + /* > + * Now query the adaptor.
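The two query_pkt_set tests around the semaphore here are a deliberate double-checked initialization: callers that lose the race return on the fast path and only the first caller pays for the host round trip. A minimal sketch of the same pattern, with invented names and the same relaxed assumption the driver makes (no explicit memory barriers):

        #include <linux/semaphore.h>

        static bool cache_set;
        static DEFINE_SEMAPHORE(cache_sem);     /* binary semaphore */

        static int fill_cache_once(void)
        {
                if (cache_set)                  /* fast path, unlocked */
                        return 0;

                down(&cache_sem);
                if (cache_set) {                /* raced with another caller */
                        up(&cache_sem);
                        return 0;
                }
                /* ... one-time expensive host query would go here ... */
                cache_set = true;
                up(&cache_sem);
                return 0;
        }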
> + */ > + > + pkt = &nd_dev->query_pkt; > + > + pkt_type = NDV_PKT_ID1_CONTROL; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + > + pkt->hdr.pkt_hdr.packet_type = pkt_type; > + pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > + pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_query_adaptor) - > + sizeof(struct ndv_packet_hdr_control_1); > + > + > + pkt->hdr.file_handle.local = uctx->file_handle.local; > + pkt->hdr.file_handle.remote = uctx->file_handle.remote; > + > + pkt->hdr.irp_handle.val64 = 0; > + > + pkt->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_QUERY; > + pkt->hdr.output_buf_sz = sizeof(struct nd_adap_query_ioctl); > + pkt->hdr.input_buf_sz = sizeof(struct nd_adap_query_ioctl); > + pkt->hdr.input_output_buf_offset = 0; > + memset(&pkt->ioctl.ad_q, 0, sizeof(struct nd_adap_query_ioctl)); > + > + pkt->ioctl.ad_q.version = ND_VERSION_1; > + pkt->ioctl.ad_q.info_version = ND_VERSION_2; > + pkt->ioctl.ad_q.adapter_handle = uctx->adaptor_hdl; > + > + ret = hvnd_send_packet(nd_dev, pkt, > + sizeof(struct pkt_nd_query_adaptor), > + (unsigned long)pkt, true); > + > + hvnd_debug("pkt->ioctl.ad_info.inline_request_threshold=%d\n", > + pkt->ioctl.ad_info.inline_request_threshold); > + nd_dev->query_pkt_set = true; > + > + // how about host returning PENDING > + up(&nd_dev->query_pkt_sem); > + > + if (ret) > + return ret; > + > + hvnd_debug("Query Adaptor Succeeded\n"); > + > + /* > + * Cache the relevant properties. > + */ > + props->fw_ver = 0; > + props->device_cap_flags = 0; > + props->vendor_id = 0x15b3; > + props->vendor_part_id = adap_info->device_id; > + props->max_mr_size = ~0ull; > + props->page_size_cap = PAGE_SIZE; > + props->max_qp = 16384; > + props->max_qp_wr = min(adap_info->max_recv_q_depth, > + adap_info->max_initiator_q_depth); > + props->max_sge = min(adap_info->max_initiator_sge, > + adap_info->max_recv_sge); > + props->max_cq = 0x1FFFF; > + props->max_cqe = adap_info->max_completion_q_depth; > + props->max_mr = 16384; > + props->max_pd = 16384; > + props->max_qp_rd_atom = adap_info->max_inbound_read_limit; > + props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; > + props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; > + props->max_srq = 16384; > + props->max_srq_wr = adap_info->max_recv_q_depth; > + props->max_srq_sge = adap_info->max_recv_sge; > + return 0; > +} > + > + > +int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev, > + struct hvnd_ib_pd *hvnd_pd) > +{ > + struct pkt_nd_pd_create *pkt = &uctx->pd_cr_pkt; > + int ret; > + int pkt_type; > + > + hvnd_debug("Create Protection Domain\n"); > + > + pkt_type = NDV_PKT_ID1_CONTROL; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + > + pkt->hdr.pkt_hdr.packet_type = pkt_type; > + pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > + pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_pd_create) - > + sizeof(struct ndv_packet_hdr_control_1); > + > + hvnd_debug("pdcreate packet size: %d\n", > + (int)sizeof(struct pkt_nd_pd_create)); > + hvnd_debug("pdcreate hdr size: %d\n", > + (int)sizeof(struct ndv_packet_hdr_control_1)); > + hvnd_debug("pdcreate data size: %d\n", pkt->hdr.pkt_hdr.data_sz); > + > + pkt->hdr.file_handle.local = uctx->create_pkt.handle.local; > + pkt->hdr.file_handle.remote = uctx->create_pkt.handle.remote; > + > + hvnd_debug("create pd uctx is %p\n", uctx); > + hvnd_debug("create pd local file is %d\n", > + uctx->create_pkt.handle.local); > + hvnd_debug("create pd remote file
is %d\n", > + uctx->create_pkt.handle.remote); > + > + pkt->hdr.irp_handle.val64 = 0; > + pkt->hdr.io_cntrl_code = IOCTL_ND_PD_CREATE; > + > + pkt->hdr.output_buf_sz = sizeof(struct nd_create_pd_ioctl); > + pkt->hdr.input_buf_sz = sizeof(struct nd_create_pd_ioctl); > + pkt->hdr.input_output_buf_offset = 0; > + > + hvnd_debug("output/input buf size: %d\n", pkt->hdr.output_buf_sz); > + > + /* > + * Fill the ioctl section. > + */ > + pkt->ioctl.in.version = ND_VERSION_1; > + pkt->ioctl.in.reserved = 0; > + pkt->ioctl.in.handle = uctx->adaptor_hdl; > + > + > + ret = hvnd_send_packet(nd_dev, pkt, > + sizeof(struct pkt_nd_pd_create), > + (unsigned long)pkt, true); > + > + if (ret) > + return ret; > + > + if (pkt->hdr.pkt_hdr.status != 0) { > + hvnd_error("Create PD failed; status is %d\n", > + pkt->hdr.pkt_hdr.status); > + return -EINVAL; > + } > + if (pkt->hdr.io_status != 0) { > + hvnd_error("Create PD failed; io status is %d\n", > + pkt->hdr.io_status); > + return -EINVAL; > + } > + > + hvnd_debug("Create PD Succeeded\n"); > + > + hvnd_debug("pd_handle is %p\n", (void *)pkt->ioctl.resp.pd_handle); > + hvnd_debug("pdn is %d\n", (int)pkt->ioctl.resp.pdn); > + > + hvnd_pd->pdn = pkt->ioctl.resp.pdn; > + hvnd_pd->handle = pkt->ioctl.out_handle; > + > + return 0; > +} > + > +int hvnd_cancel_io(struct hvnd_ep_obj *ep_object) > +{ > + struct pkt_nd_cancel_io pkt; > + int ret; > + u32 ioctl; > + > + switch (ep_object->type) { > + case ND_LISTENER: > + hvnd_debug("LISTENER I/O Cancelled\n"); > + ioctl = IOCTL_ND_LISTENER_CANCEL_IO; > + break; > + case ND_CONNECTOR: > + hvnd_debug("CONNECTOR I/O Cancelled\n"); > + ioctl = IOCTL_ND_CONNECTOR_CANCEL_IO; > + break; > + case ND_MR: > + hvnd_debug("MR I/O Cancelled\n"); > + ioctl = IOCTL_ND_MR_CANCEL_IO; > + break; > + case ND_CQ: > + hvnd_debug("CQ I/O Cancelled\n"); > + ioctl = IOCTL_ND_CQ_CANCEL_IO; > + break; > + default: > + hvnd_error("UNKNOWN object type\n"); > + return -EINVAL; > + } > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + ep_object->uctx->create_pkt.handle.local, > + ep_object->uctx->create_pkt.handle.remote, > + ioctl, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = ep_object->ep_handle; > + hvnd_debug("cancel io handle is %p\n", (void *)ep_object->ep_handle); > + > + ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, &pkt.hdr, > + sizeof(pkt), > + (u64)&pkt); > + > + if (ret) > + goto err; > + > + /* > + * Now that we have cancelled all I/Os, we are done. > + */ > + > + return 0; > + > +err: > + hvnd_error("cancel I/O operation failed\n"); > + return ret; > +} > + > + > +int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle, u32 ioctl) > +{ > + struct pkt_nd_free_handle pkt; > + int ret; > + > + hvnd_debug("Freeing handle ioctl is %s; handle is %p\n", > + hvnd_get_op_name(ioctl), (void *)handle); > + > + hvnd_debug("uctx is %p\n", uctx); > + hvnd_debug("nd_dev is %p\n", nd_dev); > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + ioctl, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section.
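hvnd_free_handle here is the plainest instance of the request shape this file repeats for most host calls: zero the packet, size the header as sizeof(pkt) minus the control header, fill in the { version, reserved, handle } triple, send, and propagate the status. Reduced to a sketch (pkt_nd_simple and hvnd_handle_only_ioctl are invented stand-ins; the real code declares a distinct pkt_nd_* type per ioctl):

        struct pkt_nd_simple {
                struct ndv_packet_hdr_control_1 hdr;
                struct nd_handle in;    /* { version, reserved, handle } */
        };

        static int hvnd_handle_only_ioctl(struct hvnd_dev *nd_dev,
                                          struct hvnd_ucontext *uctx,
                                          u64 handle, u32 ioctl)
        {
                struct pkt_nd_simple pkt;

                memset(&pkt, 0, sizeof(pkt));
                hvnd_init_hdr(&pkt.hdr,
                              sizeof(pkt) - sizeof(struct ndv_packet_hdr_control_1),
                              uctx->create_pkt.handle.local,
                              uctx->create_pkt.handle.remote, ioctl, 0, 0, 0);

                pkt.in.version = ND_VERSION_1;
                pkt.in.reserved = 0;
                pkt.in.handle = handle;

                return hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt),
                                           (u64)&pkt);
        }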
> + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + hvnd_error("%s: ret=%d\n", __func__, ret); > + return ret; > +} > + > +int hvnd_negotiate_version(struct hvnd_dev *nd_dev) > +{ > + union ndv_packet_init *pkt = &nd_dev->init_pkt; > + int ret; > + > + nd_dev->negotiated_version = NDV_PROTOCOL_VERSION_INVALID; > + > + pkt->packet_type = NDV_PACKET_TYPE_INIT; > + pkt->protocol_version = NDV_PROTOCOL_VERSION_CURRENT; > + pkt->flags = 0; // KYS are the flags 0? > + > + ret = hvnd_send_packet(nd_dev, pkt, > + sizeof(union ndv_packet_init), (u64)NULL, true); > + > + return ret; > +} > + > +void hvnd_callback(void *context) > +{ > + struct hv_device *dev = context; > + struct hvnd_dev *nd_dev = hv_get_drvdata(dev); > + int copy_sz = 0; > + struct ndv_packet_hdr_control_1 *ctrl_hdr; > + union ndv_packet_init *pkt_init; > + u32 recvlen; > + u32 local_irp; > + u64 requestid; > + u32 *pkt_type; > + u32 pkt_id; > + struct hvnd_ep_obj *ep_object; > + struct incoming_pkt *incoming_pkt; /* Used only for asynch calls */ > + char *incoming_pkt_start; > + struct vmpacket_descriptor *desc; > + int status; > + struct hvnd_cookie *hvnd_cookie; > + unsigned long flags; > + > + vmbus_recvpacket_raw(dev->channel, hvnd_recv_buffer, > + (PAGE_SIZE * 4), &recvlen, &requestid); > + > + if (recvlen <= 0) > + return; > + > + desc = (struct vmpacket_descriptor *)hvnd_recv_buffer; > + incoming_pkt_start = hvnd_recv_buffer + (desc->offset8 << 3); > + recvlen -= desc->offset8 << 3; > + > + pkt_type = (u32 *)incoming_pkt_start; > + pkt_id = *pkt_type; > + if (pkt_id != NDV_PACKET_TYPE_INIT) > + pkt_id = NDV_PACKET_TYPE_ID(pkt_id); > + > + switch (pkt_id) { > + case NDV_PACKET_TYPE_INIT: > + /* > + * Host is responding to our init packet.
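For context on the dispatch below: synchronous requests in this driver rendezvous on a cookie whose address rides in the VMBus request id; hvnd_callback() copies copy_sz bytes of the reply into the cookie's buffer and completes it. The sending side is not part of this hunk, so the following is only a plausible sketch inferred from the callback (the hvnd_cookie fields match the usage visible here; hvnd_send_wait is an invented name):

        #include <linux/completion.h>
        #include <linux/hyperv.h>

        struct hvnd_cookie {
                struct completion host_event;
                void *pkt;              /* reply is copied back in here */
        };

        static int hvnd_send_wait(struct hv_device *hvdev, void *pkt, u32 len)
        {
                struct hvnd_cookie cookie;
                int ret;

                init_completion(&cookie.host_event);
                cookie.pkt = pkt;

                ret = vmbus_sendpacket(hvdev->channel, pkt, len,
                                       (unsigned long)&cookie,
                                       VM_PKT_DATA_INBAND,
                                       VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
                if (ret)
                        return ret;

                wait_for_completion(&cookie.host_event);
                return 0;
        }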
> + */ > + pkt_init = (union ndv_packet_init *)incoming_pkt_start; > + nd_dev->negotiated_version = pkt_init->protocol_version; > + copy_sz = 0; > + break; > + > + case NDV_PKT_ID1_INIT_RESOURCES: > + copy_sz = 0; > + break; > + > + case NDV_PKT_ID1_BIND: > + nd_dev->bind_pkt.pkt_hdr.status = > + ((union ndv_packet_hdr *) incoming_pkt_start)->status; > + copy_sz = 0; > + break; > + > + case NDV_PKT_ID1_COMPLETE: > + ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; > + status = ctrl_hdr->io_status; > + > + local_irp = ctrl_hdr->irp_handle.local; > + ep_object = (struct hvnd_ep_obj *)map_irp_to_ctx(nd_dev, > + local_irp); > + > + if (!ep_object) { > + hvnd_error("irp could not be mapped; irp is %d ioctl is %s\n", > + local_irp, > + hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); > + goto complete; > + } > + > + if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) > + hvnd_debug("completion packet; iostatus is %x, ioctl is %s\n", > + ctrl_hdr->io_status, > + hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); > + > + switch (ctrl_hdr->io_cntrl_code) { > + > + case IOCTL_ND_CQ_NOTIFY: > + hvnd_process_cq_event_complete(ep_object, status); > + > + ep_del_work_pending(ep_object); > + goto complete; > + > + case IOCTL_ND_CONNECTOR_ACCEPT: > + > + hvnd_process_connector_accept(ep_object, status); > + > + ep_del_work_pending(ep_object); > + goto complete; > + > + case IOCTL_ND_CONNECTOR_DISCONNECT: > + hvnd_debug("disconnected: ep obj is %p; status: %d\n", > + ep_object, status); > + hvnd_process_disconnect(ep_object, status); > + > + ep_del_work_pending(ep_object); > + goto complete; > + > + default: > + break; > + } > + > + /* > + * This is the completion notification; > + * the IRP cookie is the state through which > + * we will invoke the callback. > + */ > + incoming_pkt = kmalloc(recvlen + sizeof(struct incoming_pkt), > + GFP_ATOMIC); > + if (incoming_pkt == NULL) { > + hvnd_error("Could not alloc memory in callback\n"); > + ep_del_work_pending(ep_object); > + goto complete; > + } > + memcpy(incoming_pkt->pkt, incoming_pkt_start, recvlen); > + > + spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); > + list_add_tail(&incoming_pkt->list_entry, > + &ep_object->incoming_pkt_list); > + spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, > + flags); > + > + schedule_work(&ep_object->wrk.work); > + > + goto complete; > + > + case NDV_PKT_ID1_CREATE: > + copy_sz = sizeof(struct ndv_pkt_hdr_create_1); > + break; > + > + case NDV_PKT_ID1_CLEANUP: > + copy_sz = sizeof(struct ndv_pkt_hdr_cleanup_1); > + break; > + > + case NDV_PKT_ID1_CONTROL: > + ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; > + status = ctrl_hdr->io_status; > + > + if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) > + hvnd_debug("packet; iostatus is %x ioctl is %s\n", > + ctrl_hdr->io_status, hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); > + > + switch (ctrl_hdr->io_cntrl_code) { > + > + case IOCTL_ND_PROVIDER_INIT: > + copy_sz = sizeof(struct pkt_nd_provider_ioctl); > + break; > + > + case IOCTL_ND_PROVIDER_BIND_FILE: > + copy_sz = sizeof(struct pkt_nd_provider_ioctl); > + break; > + > + case IOCTL_ND_ADAPTER_OPEN: > + copy_sz = sizeof(struct pkt_nd_open_adapter); > + break; > + > + case IOCTL_ND_ADAPTER_CLOSE: > + copy_sz = sizeof(struct pkt_nd_free_handle); > + break; > + > + case IOCTL_ND_ADAPTER_QUERY: > + copy_sz = sizeof(struct pkt_nd_query_adaptor); > + break; > + > + case IOCTL_ND_PD_CREATE: > + copy_sz = sizeof(struct pkt_nd_pd_create); > + break; > + > + case IOCTL_ND_PD_FREE: > + copy_sz =
sizeof(struct pkt_nd_free_handle); > + break; > + > + case IOCTL_ND_CQ_CREATE: > + copy_sz = sizeof(struct pkt_nd_create_cq); > + break; > + > + case IOCTL_ND_CQ_FREE: > + copy_sz = sizeof(struct pkt_nd_free_cq); > + break; > + > + case IOCTL_ND_CQ_NOTIFY: //FIXME check ep stop state > + local_irp = ctrl_hdr->irp_handle.local; > + ep_object = (struct hvnd_ep_obj *) > + map_irp_to_ctx(nd_dev, local_irp); > + if (!ep_object) { > + hvnd_error("irp could not be mapped\n"); > + goto complete; > + } > + copy_sz = sizeof(struct pkt_nd_notify_cq); > + hvnd_process_cq_event_pending(ep_object, status); > + goto complete; > + > + case IOCTL_ND_LISTENER_CREATE: > + copy_sz = sizeof(struct pkt_nd_cr_listener); > + break; > + > + case IOCTL_ND_LISTENER_FREE: > + copy_sz = sizeof(struct pkt_nd_free_listener); > + break; > + > + case IOCTL_ND_QP_FREE: > + copy_sz = sizeof(struct pkt_nd_free_handle); > + break; > + > + case IOCTL_ND_CONNECTOR_CANCEL_IO: > + case IOCTL_ND_MR_CANCEL_IO: > + case IOCTL_ND_CQ_CANCEL_IO: > + case IOCTL_ND_LISTENER_CANCEL_IO: > + copy_sz = sizeof(struct pkt_nd_cancel_io); > + break; > + > + case IOCTL_ND_LISTENER_BIND: > + copy_sz = sizeof(struct pkt_nd_bind_listener); > + break; > + > + case IOCTL_ND_LISTENER_LISTEN: > + copy_sz = sizeof(struct pkt_nd_listen_listener); > + break; > + > + case IOCTL_ND_LISTENER_GET_ADDRESS: > + copy_sz = sizeof(struct pkt_nd_get_addr_listener); > + break; > + > + case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: > + copy_sz = sizeof(struct pkt_nd_get_connection_listener); > + goto complete; // non-block > + > + case IOCTL_ND_CONNECTOR_CREATE: > + copy_sz = sizeof(struct pkt_nd_cr_connector); > + break; > + > + case IOCTL_ND_CONNECTOR_FREE: > + copy_sz = sizeof(struct pkt_nd_free_connector); > + break; > + > + case IOCTL_ND_CONNECTOR_BIND: > + copy_sz = sizeof(struct pkt_nd_free_connector); > + break; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_CONNECT: > + copy_sz = sizeof(struct pkt_nd_connector_connect); > + goto complete; //non-block > + > + case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: > + copy_sz = sizeof(struct pkt_nd_connector_connect_complete); > + break; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_ACCEPT: > + copy_sz = sizeof(struct pkt_nd_connector_accept); > + goto complete; //non-block > + > + case IOCTL_ND_CONNECTOR_REJECT: > + copy_sz = sizeof(struct pkt_nd_connector_reject); > + break; > + > + case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: > + copy_sz = sizeof(struct pkt_nd_connector_get_rd_limits); > + break; > + > + case IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: > + copy_sz = sizeof(struct pkt_nd_connector_get_priv_data); > + break; > + > + case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: > + copy_sz = sizeof(struct pkt_nd_connector_get_peer_addr); > + break; > + > + case IOCTL_ND_CONNECTOR_GET_ADDRESS: > + copy_sz = sizeof(struct pkt_nd_connector_get_addr); > + break; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: > + copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); > + goto complete; //non-block > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_DISCONNECT: > + hvnd_debug("IOCTL_ND_CONNECTOR_DISCONNECT\n"); > + copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); > + goto complete; // non-block > + > + case IOCTL_ND_QP_CREATE: > + copy_sz = sizeof(struct pkt_nd_create_qp); > + break; > + > + case IOCTL_ND_MR_CREATE: > + copy_sz = sizeof(struct pkt_nd_create_mr); > + break; > + > + case IOCTL_ND_MR_FREE: >
+ copy_sz = sizeof(struct pkt_nd_free_handle); > + break; > + > + case IOCTL_ND_MR_REGISTER: > + copy_sz = sizeof(struct pkt_nd_register_mr); > + break; > + > + case IOCTL_ND_MR_DEREGISTER: > + copy_sz = sizeof(struct pkt_nd_deregister_mr); > + break; > + > + case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: > + copy_sz = sizeof(struct pkt_query_addr_list); > + break; > + > + case IOCTL_ND_QP_FLUSH: > + copy_sz = sizeof(struct pkt_nd_flush_qp); > + break; > + > + default: > + hvnd_warn("Got unknown ioctl: %d\n", > + ctrl_hdr->io_cntrl_code); > + copy_sz = 0; > + break; > + } > + > + break; > + default: > + hvnd_warn("Got an unknown packet type %d\n", *pkt_type); > + break; > + } > + > + hvnd_cookie = (struct hvnd_cookie *)requestid; > + memcpy(hvnd_cookie->pkt, incoming_pkt_start, copy_sz); > + complete(&hvnd_cookie->host_event); > + > +complete: > + /* send out ioctl completion packet */ > + if (desc->flags & VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED) { > + int retry = 5; > + > + while (true) { > + int ret; > + > + ret = vmbus_sendpacket(dev->channel, NULL, 0, > + requestid, VM_PKT_COMP, 0); > + if (ret == 0) { > + break; > + } else if (ret == -EAGAIN) { > + if (--retry == 0) { > + hvnd_error("give up retrying send completion packet\n"); > + break; > + } > + hvnd_warn("retrying send completion packet\n"); > + udelay(100); > + } else { > + hvnd_error("unable to send completion packet ret=%d\n", > + ret); > + break; > + } > + } > + } > + > +} > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.h > new file mode 100644 > index 000000000000..78e534fa8748 > --- /dev/null > +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.h > @@ -0,0 +1,2202 @@ > +/* > + * Copyright (c) 2014, Microsoft Corporation. > + * > + * Author: > + * K. Y. Srinivasan <kys@microsoft.com> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms of the GNU General Public License version 2 as published > + * by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or > + * NON INFRINGEMENT. See the GNU General Public License for more > + * details. > + * > + * Bug fixes/enhancements: Long Li <longli@microsoft.com> > + */ > + > + > +#ifndef _VMBUS_RDMA_H > +#define _VMBUS_RDMA_H > + > + > +#include <linux/in.h> > +#include <linux/in6.h> > +#include <rdma/ib_verbs.h> > +#include <linux/idr.h> > +#include <linux/if_ether.h> > + > +/* NetworkDirect version numbers.
> + */ > +#define ND_VERSION_1 0x1 > +#define ND_VERSION_2 0x20000 > + > +#ifndef NDVER > +#define NDVER ND_VERSION_2 > +#endif > + > +#define ND_ADAPTER_FLAG_IN_ORDER_DMA_SUPPORTED 0x00000001 > +#define ND_ADAPTER_FLAG_CQ_INTERRUPT_MODERATION_SUPPORTED 0x00000004 > +#define ND_ADAPTER_FLAG_MULTI_ENGINE_SUPPORTED 0x00000008 > +#define ND_ADAPTER_FLAG_CQ_RESIZE_SUPPORTED 0x00000100 > +#define ND_ADAPTER_FLAG_LOOPBACK_CONNECTIONS_SUPPORTED 0x00010000 > + > +#define ND_CQ_NOTIFY_ERRORS 0 > +#define ND_CQ_NOTIFY_ANY 1 > +#define ND_CQ_NOTIFY_SOLICITED 2 > + > +#define ND_MR_FLAG_ALLOW_LOCAL_WRITE 0x00000001 > +#define ND_MR_FLAG_ALLOW_REMOTE_READ 0x00000002 > +#define ND_MR_FLAG_ALLOW_REMOTE_WRITE 0x00000005 > +#define ND_MR_FLAG_RDMA_READ_SINK 0x00000008 > +#define ND_MR_FLAG_DO_NOT_SECURE_VM 0x80000000 > + > +#define ND_OP_FLAG_SILENT_SUCCESS 0x00000001 > +#define ND_OP_FLAG_READ_FENCE 0x00000002 > +#define ND_OP_FLAG_SEND_AND_SOLICIT_EVENT 0x00000004 > +#define ND_OP_FLAG_ALLOW_READ 0x00000008 > +#define ND_OP_FLAG_ALLOW_WRITE 0x00000010 > + > +#if NDVER >= ND_VERSION_2 > +#define ND_OP_FLAG_INLINE 0x00000020 > +#endif > + > +#define ND_AF_INET6 23 > +#define IF_MAX_ADDR_LENGTH 32 > + > +struct group_affinity { > + u64 mask; //KYS: usually 0 > + u16 group; // KYS usually -1 > + u16 reserved[3]; > +}; > + > +struct if_physical_addr { > + u16 length; > + u8 addr[IF_MAX_ADDR_LENGTH]; > +}; > + > +struct adapter_info_v2 { > + u32 info_version; > + u16 vendor_id; > + u16 device_id; > + u64 adapter_id; > + size_t max_registration_size; > + size_t max_window_size; > + u32 max_initiator_sge; > + u32 max_recv_sge; > + u32 max_read_sge; > + u32 max_transfer_length; > + u32 max_inline_data_size; > + u32 max_inbound_read_limit; > + u32 max_outbound_read_limit; > + u32 max_recv_q_depth; > + u32 max_initiator_q_depth; > + u32 max_shared_recv_q_depth; > + u32 max_completion_q_depth; > + u32 inline_request_threshold; > + u32 large_request_threshold; > + u32 max_caller_data; > + u32 max_callee_data; > + u32 adapter_flags; > +} __packed; > + > +struct nd2_adapter_info_32 { //KYS: Check what this is > + u32 info_version; > + u16 vendor_id; > + u16 devic_id; > + u64 adapter_id; > + u32 max_registration_size; > + u32 max_window_size; > + u32 max_initiator_sge; > + u32 max_recv_sge; > + u32 max_read_sge; > + u32 max_transfer_length; > + u32 max_inline_data_size; > + u32 max_inbound_read_limit; > + u32 max_outbound_read_limit; > + u32 max_recv_q_depth; > + u32 max_initiator_q_depth; > + u32 max_shared_recv_q_depth; > + u32 max_completion_q_depth; > + u32 inline_request_threshold; > + u32 large_request_threshold; > + u32 max_caller_data; > + u32 max_callee_data; > + u32 adapter_flags; > +} __packed; > + > +enum nd2_request_type { > + ND2_RT_RECEIVE, > + ND2_RT_SEND, > + ND2_RT_BIND, > + ND2_RT_INVALIDATE, > + ND2_RT_READ, > + ND2_RT_WRITE > +}; > + > +struct nd2_result { > + u32 status; > + u32 bytes_transferred; > + void *qp_ctx; > + void *request_ctx; > + enum nd2_request_type request_type; > +} __packed; > + > +struct nd2_sge { > + void *buffer; > + u32 buffer_length; > + u32 mr_token; > +} __packed; > + > +/* > + * The communication with the host via ioctls using VMBUS > + * as the transport. 
> + */ > + > +#define ND_IOCTL_VERSION 1 > + > +enum nd_mapping_type { > + ND_MAP_IOSPACE, > + ND_MAP_MEMORY, > + ND_MAP_MEMORY_COALLESCE, > + ND_MAP_PAGES, > + ND_MAP_PAGES_COALLESCE, > + ND_UNMAP_IOSPACE, > + ND_UNMAP_MEMORY, > + ND_MAX_MAP_TYPE > +}; > + > +enum nd_caching_type { > + ND_NON_CACHED = 0, > + ND_CACHED, > + ND_WRITE_COMBINED, > + ND_MAX_CACHE_TYPE > +}; > + > +enum nd_aceess_type { > + ND_READ_ACCESS = 0, > + ND_WRITE_ACCESS, > + ND_MODIFY_ACCESS > +}; > + > +struct nd_map_io_space { > + enum nd_mapping_type map_type; > + enum nd_caching_type cache_type; > + u32 cb_length; > +}; > + > +struct nd_map_memory { > + enum nd_mapping_type map_type; > + enum nd_aceess_type access_type; > + u64 address; > + u32 cb_length; > +}; > + > +struct nd_mapping_id { > + enum nd_mapping_type map_type; > + u64 id; > +}; > + > +struct ndk_map_pages { > + struct nd_map_memory header; > + u32 page_offset; > +}; > + > +union nd_mapping { > + enum nd_mapping_type map_type; > + struct nd_map_io_space map_io_space; > + struct nd_map_memory map_memory; > + struct nd_mapping_id mapping_id; > + struct ndk_map_pages map_pages; > +}; > + > +struct nd_mapping_result { > + u64 id; > + u64 info; > +}; > + > +struct nd_resource_descriptor { > + u64 handle; > + u32 ce_mapping_results; > + u32 cb_mapping_results_offset; > +}; > + > +struct nd_handle { > + u32 version; > + u32 reserved; > + u64 handle; > +}; > + > +union nd_sockaddr_inet { > + struct sockaddr_in ipv4; > + struct sockaddr_in6 ipv6; > + u16 address_family; //KYS how is this supposed to work? > +}; > + > +struct nd_address_element { > + union nd_sockaddr_inet addr; > + char mac_addr[ETH_ALEN]; > +}; > + > +struct nd_resolve_address { > + u32 version; > + u32 reserved; > + union nd_sockaddr_inet address; > +}; > + > +struct nd_open_adapter { > + u32 version; > + u32 reserved; > + u32 ce_mapping_cnt; > + u32 cb_mapping_offset; > + u64 adapter_id; > +}; > + > +struct nd_adapter_query { > + u32 version; > + u32 info_version; > + u64 adapter_handle; > +}; > + > +struct nd_create_cq { > + u32 version; > + u32 queue_depth; > + u32 ce_mapping_cnt; > + u32 cb_mapping_offset; > + u64 adapter_handle; > + struct group_affinity affinity; > +}; > + > +struct nd_create_srq { > + u32 version; > + u32 queue_depth; > + u32 ce_mapping_cnt; > + u32 cb_mapping_offset; > + u32 max_request_sge; > + u32 notify_threshold; > + u64 pd_handle; > + struct group_affinity affinity; > +}; > + > +struct nd_create_qp_hdr { > + u32 version; > + u32 cb_max_inline_data; > + u32 ce_mapping_cnt; > + u32 cb_mapping_offset; //KYS: what is this prefix - ce/cb > + u32 initiator_queue_depth; > + u32 max_initiator_request_sge; > + u64 receive_cq_handle; > + u64 initiator_cq_handle; > + u64 pd_handle; > +}; > + > +struct nd_create_qp { > + struct nd_create_qp_hdr hdr; > + u32 receive_queue_depth; > + u32 max_receive_request_sge; > +}; > + > +struct nd_create_qp_with_srq { > + struct nd_create_qp_hdr header; > + u64 srq_handle; > +}; > + > +struct nd_srq_modify { > + u32 version; > + u32 queue_depth; > + u32 ce_mapping_cnt; > + u32 cb_mapping_offset; > + u32 notify_threshold; > + u32 reserved; > + u64 srq_handle; > +}; > + > +struct nd_cq_modify { > + u32 version; > + u32 queue_depth; > + u32 ce_mapping_count; > + u32 cb_mappings_offset; > + u64 cq_handle; > +}; > + > +struct nd_cq_notify { > + u32 version; > + u32 type; > + u64 cq_handle; > +}; > + > +struct nd_mr_register_hdr { > + u32 version; > + u32 flags; > + u64 cb_length; > + u64 target_addr; > + u64 mr_handle; > +}; > + > +struct 
nd_mr_register { > + struct nd_mr_register_hdr header; > + u64 address; > +}; > + > +struct nd_bind { > + u32 version; > + u32 reserved; > + u64 handle; > + union nd_sockaddr_inet address; > +}; > + > +struct nd_read_limits { > + u32 inbound; > + u32 outbound; > +}; > + > +struct nd_connect { > + u32 version; > + u32 reserved; > + struct nd_read_limits read_limits; > + u32 cb_private_data_length; > + u32 cb_private_data_offset; > + u64 connector_handle; > + u64 qp_handle; > + union nd_sockaddr_inet destination_address; > + struct if_physical_addr phys_addr; > +}; > + > +struct nd_accept { > + u32 version; > + u32 reserved; > + struct nd_read_limits read_limits; > + u32 cb_private_data_length; > + u32 cb_private_data_offset; > + u64 connector_handle; > + u64 qp_handle; > +}; > + > +struct nd_reject { > + u32 version; > + u32 reserved; > + u32 cb_private_data_length; > + u32 cb_private_data_offset; > + u64 connector_handle; > +}; > + > +struct nd_listen { > + u32 version; > + u32 back_log; > + u64 listener_handle; > +}; > + > +struct nd_get_connection_request { > + u32 version; > + u32 reserved; > + u64 listener_handle; > + u64 connector_handle; > +}; > + > +enum ndv_mmio_type { > + ND_PARTITION_KERNEL_VIRTUAL, > + ND_PARTITION_SYSTEM_PHYSICAL, > + ND_PARTITION_GUEST_PHYSICAL, > + ND_MAXIMUM_MMIO_TYPE > +}; > + > +struct ndv_resolve_adapter_id { > + u32 version; > + struct if_physical_addr phys_addr; > +}; > + > +struct ndv_partition_create { > + u32 version; > + enum ndv_mmio_type mmio_type; > + u64 adapter_id; > + u64 xmit_cap; > +}; > + > +struct ndv_partition_bind_luid { > + u32 version; > + u32 reserved; > + u64 partition_handle; > + struct if_physical_addr phys_addr; > + //IF_LUID luid; //KYS? > +}; > + > +struct ndv_partition_bind_address { > + u32 version; > + u32 reserved; > + u64 partition_handle; > + union nd_sockaddr_inet address; > + struct if_physical_addr guest_phys_addr; > + struct if_physical_addr phys_addr; > +}; > + > +struct ndk_mr_register { > + struct nd_mr_register_hdr hdr; > + u32 cb_logical_page_addresses_offset; > +}; > + > +struct ndk_bind { > + struct nd_bind hdr; > + u64 authentication_id; > + bool is_admin; > +}; > + > +#define FDN 0x12 > +#define METHOD_BUFFERED 0x0 > +#define FAA 0x0 > + > +#define CTL_CODE( DeviceType, Function, Method, Access ) ( \ > + ((DeviceType) << 16) | ((Access) << 14) | ((Function) << 2) | (Method) \ > +) > + > +#define ND_FUNCTION(r_, i_) ((r_) << 6 | (i_)) > +#define IOCTL_ND(r_, i_) \ > + CTL_CODE( FDN, ND_FUNCTION((r_), (i_)), METHOD_BUFFERED, FAA ) > + > +#define ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) ((ctrlCode_ >> 2) & 0xFFF) > +#define ND_RESOURCE_FROM_CTL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) >> 6) > +#define ND_OPERATION_FROM_CTRL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) & 0x3F) > + > +#define ND_DOS_DEVICE_NAME L"\\DosDevices\\Global\\NetworkDirect" > +#define ND_WIN32_DEVICE_NAME L"\\\\.\\NetworkDirect" > + > +enum nd_resource_type { > + ND_PROVIDER = 0, > + ND_ADAPTER, > + ND_PD, > + ND_CQ, > + ND_MR, > + ND_MW, > + ND_SRQ, > + ND_CONNECTOR, > + ND_LISTENER, > + ND_QP, > + ND_VIRTUAL_PARTITION, > + ND_RESOURCE_TYPE_COUNT > +}; > + > +#define ND_OPERATION_COUNT 14 > + > +#define IOCTL_ND_PROVIDER(i_) IOCTL_ND(ND_PROVIDER, i_) > +#define IOCTL_ND_ADAPTER(i_) IOCTL_ND(ND_ADAPTER, i_) > +#define IOCTL_ND_PD(i_) IOCTL_ND(ND_PD, i_) > +#define IOCTL_ND_CQ(i_) IOCTL_ND(ND_CQ, i_) > +#define IOCTL_ND_MR(i_) IOCTL_ND(ND_MR, i_) > +#define IOCTL_ND_MW(i_) IOCTL_ND(ND_MW, i_) > +#define IOCTL_ND_SRQ(i_) 
IOCTL_ND(ND_SRQ, i_) > +#define IOCTL_ND_CONNECTOR(i_) IOCTL_ND(ND_CONNECTOR, i_) > +#define IOCTL_ND_LISTENER(i_) IOCTL_ND(ND_LISTENER, i_) > +#define IOCTL_ND_QP(i_) IOCTL_ND(ND_QP, i_) > +#define IOCTL_ND_VIRTUAL_PARTITION(i_) IOCTL_ND(ND_VIRTUAL_PARTITION, i_) > + > +/* Provider IOCTLs */ > +#define IOCTL_ND_PROVIDER_INIT IOCTL_ND_PROVIDER( 0 ) > +#define IOCTL_ND_PROVIDER_BIND_FILE IOCTL_ND_PROVIDER( 1 ) > +#define IOCTL_ND_PROVIDER_QUERY_ADDRESS_LIST IOCTL_ND_PROVIDER( 2 ) > +#define IOCTL_ND_PROVIDER_RESOLVE_ADDRESS IOCTL_ND_PROVIDER( 3 ) > +#define IOCTL_ND_PROVIDER_MAX_OPERATION 4 > + > +/* Adapter IOCTLs */ > +#define IOCTL_ND_ADAPTER_OPEN IOCTL_ND_ADAPTER( 0 ) > +#define IOCTL_ND_ADAPTER_CLOSE IOCTL_ND_ADAPTER( 1 ) > +#define IOCTL_ND_ADAPTER_QUERY IOCTL_ND_ADAPTER( 2 ) > +#define IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST IOCTL_ND_ADAPTER( 3 ) > +#define IOCTL_ND_ADAPTER_MAX_OPERATION 4 > + > +/* Protection Domain IOCTLs */ > +#define IOCTL_ND_PD_CREATE IOCTL_ND_PD( 0 ) > +#define IOCTL_ND_PD_FREE IOCTL_ND_PD( 1 ) > +#define IOCTL_ND_PD_MAX_OPERATION 2 > + > +/* Completion Queue IOCTLs */ > +#define IOCTL_ND_CQ_CREATE IOCTL_ND_CQ( 0 ) > +#define IOCTL_ND_CQ_FREE IOCTL_ND_CQ( 1 ) > +#define IOCTL_ND_CQ_CANCEL_IO IOCTL_ND_CQ( 2 ) > +#define IOCTL_ND_CQ_GET_AFFINITY IOCTL_ND_CQ( 3 ) > +#define IOCTL_ND_CQ_MODIFY IOCTL_ND_CQ( 4 ) > +#define IOCTL_ND_CQ_NOTIFY IOCTL_ND_CQ( 5 ) > +#define IOCTL_ND_CQ_MAX_OPERATION 6 > + > +/* Memory Region IOCTLs */ > +#define IOCTL_ND_MR_CREATE IOCTL_ND_MR( 0 ) > +#define IOCTL_ND_MR_FREE IOCTL_ND_MR( 1 ) > +#define IOCTL_ND_MR_CANCEL_IO IOCTL_ND_MR( 2 ) > +#define IOCTL_ND_MR_REGISTER IOCTL_ND_MR( 3 ) > +#define IOCTL_ND_MR_DEREGISTER IOCTL_ND_MR( 4 ) > +#define IOCTL_NDK_MR_REGISTER IOCTL_ND_MR( 5 ) > +#define IOCTL_ND_MR_MAX_OPERATION 6 > + > +/* Memory Window IOCTLs */ > +#define IOCTL_ND_MW_CREATE IOCTL_ND_MW( 0 ) > +#define IOCTL_ND_MW_FREE IOCTL_ND_MW( 1 ) > +#define IOCTL_ND_MW_MAX_OPERATION 2 > + > +/* Shared Receive Queue IOCTLs */ > +#define IOCTL_ND_SRQ_CREATE IOCTL_ND_SRQ( 0 ) > +#define IOCTL_ND_SRQ_FREE IOCTL_ND_SRQ( 1 ) > +#define IOCTL_ND_SRQ_CANCEL_IO IOCTL_ND_SRQ( 2 ) > +#define IOCTL_ND_SRQ_GET_AFFINITY IOCTL_ND_SRQ( 3 ) > +#define IOCTL_ND_SRQ_MODIFY IOCTL_ND_SRQ( 4 ) > +#define IOCTL_ND_SRQ_NOTIFY IOCTL_ND_SRQ( 5 ) > +#define IOCTL_ND_SRQ_MAX_OPERATION 6 > + > +/* Connector IOCTLs */ > +#define IOCTL_ND_CONNECTOR_CREATE IOCTL_ND_CONNECTOR( 0 ) > +#define IOCTL_ND_CONNECTOR_FREE IOCTL_ND_CONNECTOR( 1 ) > +#define IOCTL_ND_CONNECTOR_CANCEL_IO IOCTL_ND_CONNECTOR( 2 ) > +#define IOCTL_ND_CONNECTOR_BIND IOCTL_ND_CONNECTOR( 3 ) > +#define IOCTL_ND_CONNECTOR_CONNECT IOCTL_ND_CONNECTOR( 4 ) > +#define IOCTL_ND_CONNECTOR_COMPLETE_CONNECT IOCTL_ND_CONNECTOR( 5 ) > +#define IOCTL_ND_CONNECTOR_ACCEPT IOCTL_ND_CONNECTOR( 6 ) > +#define IOCTL_ND_CONNECTOR_REJECT IOCTL_ND_CONNECTOR( 7 ) > +#define IOCTL_ND_CONNECTOR_GET_READ_LIMITS IOCTL_ND_CONNECTOR( 8 ) > +#define IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA IOCTL_ND_CONNECTOR( 9 ) > +#define IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS IOCTL_ND_CONNECTOR( 10 ) > +#define IOCTL_ND_CONNECTOR_GET_ADDRESS IOCTL_ND_CONNECTOR( 11 ) > +#define IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT IOCTL_ND_CONNECTOR( 12 ) > +#define IOCTL_ND_CONNECTOR_DISCONNECT IOCTL_ND_CONNECTOR( 13 ) > +#define IOCTL_ND_CONNECTOR_MAX_OPERATION 14 > + > +/* Listener IOCTLs */ > +#define IOCTL_ND_LISTENER_CREATE IOCTL_ND_LISTENER( 0 ) > +#define IOCTL_ND_LISTENER_FREE IOCTL_ND_LISTENER( 1 ) > +#define IOCTL_ND_LISTENER_CANCEL_IO IOCTL_ND_LISTENER( 2 ) > 
+#define IOCTL_ND_LISTENER_BIND IOCTL_ND_LISTENER( 3 ) > +#define IOCTL_ND_LISTENER_LISTEN IOCTL_ND_LISTENER( 4 ) > +#define IOCTL_ND_LISTENER_GET_ADDRESS IOCTL_ND_LISTENER( 5 ) > +#define IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST IOCTL_ND_LISTENER( 6 ) > +#define IOCTL_ND_LISTENER_MAX_OPERATION 7 > + > +/* Queue Pair IOCTLs */ > +#define IOCTL_ND_QP_CREATE IOCTL_ND_QP( 0 ) > +#define IOCTL_ND_QP_CREATE_WITH_SRQ IOCTL_ND_QP( 1 ) > +#define IOCTL_ND_QP_FREE IOCTL_ND_QP( 2 ) > +#define IOCTL_ND_QP_FLUSH IOCTL_ND_QP( 3 ) > +#define IOCTL_ND_QP_MAX_OPERATION 4 > + > +/* Kernel-mode only IOCTLs (IRP_MJ_INTERNAL_DEVICE_CONTROL) */ > +#define IOCTL_NDV_PARTITION_RESOLVE_ADAPTER_ID IOCTL_ND_VIRTUAL_PARTITION( 0 ) > +#define IOCTL_NDV_PARTITION_CREATE IOCTL_ND_VIRTUAL_PARTITION( 1 ) > +#define IOCTL_NDV_PARTITION_FREE IOCTL_ND_VIRTUAL_PARTITION( 2 ) > +#define IOCTL_NDV_PARTITION_BIND IOCTL_ND_VIRTUAL_PARTITION( 3 ) > +#define IOCTL_NDV_PARTITION_UNBIND IOCTL_ND_VIRTUAL_PARTITION( 4 ) > +#define IOCTL_NDV_PARTITION_BIND_LUID IOCTL_ND_VIRTUAL_PARTITION( 5 ) > +#define IOCTL_NDV_PARTITION_MAX_OPERATION 6 > + > + > +#define MB_SHIFT 20 > + > + > +/* Ringbuffer size for the channel */ > +#define NDV_NUM_PAGES_IN_RING_BUFFER 64 > + > +#define NDV_MAX_PACKETS_PER_RECEIVE 8 > + > +#define NDV_MAX_PACKET_COUNT 16304 > + > +#define NDV_MAX_NUM_OUTSTANDING_RECEIVED_PACKETS (16304) > +#define NDV_MAX_HANDLE_TABLE_SIZE (16304) > +#define NDV_HOST_MAX_HANDLE_TABLE_SIZE (NDV_MAX_HANDLE_TABLE_SIZE * 16) > + > + > +#define NDV_MAX_MAPPINGS 4 > + > +#define NDV_STATE_NONE 0x00000000 > +#define NDV_STATE_CREATED 0x00000001 > +#define NDV_STATE_CONNECTING 0x00000002 > +#define NDV_STATE_INITIALIZING 0x00000003 > +#define NDV_STATE_OPERATIONAL 0xEFFFFFFF > +#define NDV_STATE_FAILED 0xFFFFFFFF > + > + > +#define NDV_MAX_PRIVATE_DATA_SIZE 64 > +#define NDV_MAX_IOCTL_SIZE 256 > + > +/* max size of buffer for vector of ND_MAPPING */ > +#define NDV_MAX_MAPPING_BUFFER_SIZE \ > + (NDV_MAX_MAPPINGS * sizeof(union nd_mapping)) > + > +/* max expected ioctl buffer size from users */ > +#define NDV_MAX_IOCTL_BUFFER_SIZE \ > + (NDV_MAX_IOCTL_SIZE + \ > + NDV_MAX_MAPPING_BUFFER_SIZE + \ > + NDV_MAX_PRIVATE_DATA_SIZE) > + > +/* max PFN array for inline buffers */ > +#define NDV_MAX_INLINE_PFN_ARRAY_LENGTH 32 > + > +/* Field header size for inline buffer */ > +#define NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE \ > + (NDV_MAX_MAPPINGS * sizeof(NDV_PACKET_FIELD)) > + > +/* Max for a single field */ > + > +#define NDV_MAX_SINGLE_MAPPING_FIELD ( sizeof(GPA_RANGE) + \ > + (sizeof(PFN_NUMBER) * NDV_MAX_INLINE_PFN_ARRAY_LENGTH)) > + > +/* Max for all inline data */ > + > +#define NDV_MAX_MAPPING_DATA_SIZE (NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE + \ > + (NDV_MAX_MAPPINGS * NDV_MAX_SINGLE_MAPPING_FIELD)) > + > + > +#define NDV_MAX_PACKET_HEADER_SIZE 256 > + > +#define NDV_MAX_PACKET_SIZE (NDV_MAX_PACKET_HEADER_SIZE + \ > + NDV_MAX_IOCTL_BUFFER_SIZE + \ > + NDV_MAX_MAPPING_DATA_SIZE) > + > +/* Well known message type INIT is defined for the channel > + * not for the protocol. > + */ > + > +#define NDV_PACKET_TYPE_INIT 0xFFFFFFFF > + > +/* Invalid protocol version to identify uninitialized channels */ > + > +#define NDV_PROTOCOL_VERSION_INVALID 0xFFFFFFFF > + > +/* Flags that control the behavior of packet handling */ > + > +enum ndv_packet_options { > + NDV_PACKET_OPTION_NONE = 0x00, > + > + /* Indicates that the ExternalDataMdl parameter is expected to be > + * passed and must be handled in the receiver.
This call must be > + * handled specially to ensure that the MDL can be created correctly. > + */ > + NDV_PACKET_OPTION_EXTERNAL_DATA = 0x01, > + > + /* Indicates that the receiver must execute the handler at passive level. */ > + NDV_PACKET_OPTIONS_REQUIRES_PASSIVE = 0x02, > + > + /* Indicates that the sender does not expect and is not waiting for a > + * response packet. > + */ > + NDV_PACKET_OPTIONS_POST = 0x04, > +}; > + > +#define NDV_PACKET_TYPE(id_, opt_) \ > + (((opt_)<<24) | (id_)) > + > +#define NDV_PACKET_TYPE_OPTIONS(type_) \ > + (((type_) >> 24) & 0xFF) > + > +#define NDV_PACKET_TYPE_ID(type_) \ > + ((type_) & 0xFFFFFF) \ > + > +#define NDV_ADD_PACKET_OPTION(type_, opt_) \ > + (type_) |= (opt_<<24) > + > +/* The header value sent on all packets */ > +union ndv_packet_hdr { > + > + struct { > + /* The type of packet. > + * This value should be created with the NDV_PACKET_TYPE macro > + * to include all packet options within the packet type. > + */ > + u32 packet_type; > + /* The size of the entire fixed message structure that exists > + * before the data. This must be >= sizeof(NDV_PACKET_HEADER) > + */ > + u32 hdr_sz; > + /* The size of the data that follows the message. > + * data_sz + hdr_sz gives the total size of > + * the buffer that is used. > + */ > + u32 data_sz; > + /* The status code used to indicate success or failure. > + * It is only used in completions and during responses. > + */ > + u32 status; //KYS: NTSTATUS? > + }; > + > + u64 padding[2]; //KYS: why? > +}; > + > + > +/* The core INIT packet. This message is defined in the channel > + * not in the protocol. This message should never change size > + * or behavior, as it could impact compatibility in the future. > + * This packet is used to negotiate the protocol version, so changing > + * this size could break backward compat. > + */ > + > +union ndv_packet_init { > + struct { > + u32 packet_type; > + u32 protocol_version; > + u32 flags; > + }; > + u64 padding[2]; > +} __packed; > + > +#define NDV_PACKET_INIT_SIZE 16 > + > +/* Data packing flags used for accessing the dynamic fields inside a packet */ > +#define NDV_DATA_PACKING_2 0x1 > +#define NDV_DATA_PACKING_4 0x3 > +#define NDV_DATA_PACKING_8 0x7 > + > + > +#define NDV_PROTOCOL_VERSION_1 0x0100 > +#define NDV_PROTOCOL_VERSION_CURRENT NDV_PROTOCOL_VERSION_1 > +#define NDV_PROTOCOL_VERSION_COUNT 1 > + > +struct ndv_pkt_field { > + u32 size; > + u32 offset; > +}; > + > +enum ndv_pkt_id { > + NDV_PKT_UNKNOWN = 0, > + /* Version 1 Message ID's */ > + NDV_PKT_ID1_BIND, > + NDV_PKT_ID1_CREATE, > + NDV_PKT_ID1_CLEANUP, > + NDV_PKT_ID1_CANCEL, > + NDV_PKT_ID1_CONTROL, > + NDV_PKT_ID1_COMPLETE, > + NDV_PKT_ID1_INIT_RESOURCES, > +}; > + > +/* The guest will send this as the first message just after init. > + * The resources are reserved per channel. > + */ > + > +struct ndv_pkt_hdr_init_resources_1 { > + > + union ndv_packet_hdr pkt_hdr; > + u16 io_space_sz_mb; > + u64 io_space_start; > + > +}; > + > + > + > +/* The guest will send this packet to the host after channel init > + * to query support for the adapters that are registered.
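Since traces such as "bind packet type is %d ID:%d" only make sense once the bit layout is clear, here is the packing worked through with the values hvnd_bind_nic() uses earlier in this patch (all constants follow from the definitions above):

        int pkt_type = NDV_PKT_ID1_BIND;        /* == 1, first ID after NDV_PKT_UNKNOWN */

        NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE);
        /*
         * pkt_type is now (0x02 << 24) | 1 == 0x02000001:
         *   NDV_PACKET_TYPE_ID(pkt_type)      == 1    (the ID is intact)
         *   NDV_PACKET_TYPE_OPTIONS(pkt_type) == 0x02 (the option byte)
         */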
> + */ > + > +struct ndv_pkt_hdr_bind_1 { > + union ndv_packet_hdr pkt_hdr; > + bool unbind; > + union nd_sockaddr_inet ip_address; > + struct if_physical_addr phys_addr; > + u64 guest_id; > +}; > + > +union ndv_context_handle { > + u64 val64; > + struct { > + u32 local; > + u32 remote; > + }; > +}; > + > +struct ndv_pkt_hdr_create_1 { > + union ndv_packet_hdr pkt_hdr; > + > + /* Identifies the object used to track this file handle on both > + * the guest and the host. When sent from the guest, it will contain > + * the guest handle. On success, the host will populate and return > + * its handle value as well. > + */ > + > + union ndv_context_handle handle; > + > + /* The parameters sent to the CreateFile call */ > + u32 access_mask; > + u32 open_options; > + > + u16 file_attributes; //KYS: This field must be 64 bit aligned > + > + u16 share_access; //KYS > + > + u32 kys_padding; //KYS > + > + u16 ea_length; //KYS; needs to be 64 bit aligned; what is ea length - unused > +}; > + > + > +struct ndv_pkt_hdr_cleanup_1 { > + union ndv_packet_hdr pkt_hdr; > + > + /* Identifies the object used to track this file handle on both > + * the guest and the host. When sent from the guest, it will contain > + * both the guest and host handle values. The host will use this > + * value to clean up its resource, then update its portion of the handle > + * to NDV_HANDLE_NULL before returning the data back to the guest. > + */ > + union ndv_context_handle handle; > +}; > + > +struct ndv_pkt_hdr_cancel_1 { > + union ndv_packet_hdr pkt_hdr; > + union ndv_context_handle file_handle; > + union ndv_context_handle irp_handle; > +}; > + > +struct ndv_bind_port_info { > + //LUID authentication_id; //KYS: LUID? > + bool is_admin; > +}; > + > +struct ndv_extended_data_flds { > + union { > + u32 field_count; > + u64 padding; > + }; > + > + //struct ndv_pkt_field fields[ANYSIZE_ARRAY]; //KYS? > +}; > + > + > +struct ndv_packet_hdr_control_1 { > + union ndv_packet_hdr pkt_hdr; > + /* Identifies the object used to track this file handle on both > + * the guest and the host. This should always have both guest > + * and host handle values inside it. > + */ > + > + union ndv_context_handle file_handle; > + > + /* The handle information for the allocated irp context object. > + * This information is used when the host/guest starts the cancellation > + */ > + union ndv_context_handle irp_handle; > + > + /* The input data describing the IO control parameters */ > + > + u32 io_cntrl_code; > + u32 output_buf_sz; > + u32 input_buf_sz; > + u32 input_output_buf_offset; > + > + /* These are used in the return message to indicate the status of the IO > + * operation and the amount of data written to the output buffer. > + */ > + u32 io_status; //KYS: NTSTATUS? > + u32 bytes_returned; > + > + /* This contains the field information for additional data that is sent > + * with the packet that is IOCTL specific. > + */ > + > + struct ndv_pkt_field extended_data; > +}; > + > +/* > + * Include MLX specific defines. > + */ > + > +#include "mx_abi.h" > + > +/* Driver specific state. > + */ > + > +/* > + * We need to have the host open a file; some > + * Windows constants for open.
> + */ > +#define STANDARD_RIGHTS_ALL (0x001F0000L) > +#define FILE_ATTRIBUTE_NORMAL (0x80) > +#define FILE_SHARE_READ (0x00000001) > +#define FILE_SHARE_WRITE (0x00000002) > +#define FILE_SHARE_DELETE (0x00000004) > +#define FILE_FLAG_OVERLAPPED (0x40000000) > +#define FILE_SHARE_ALL (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE) > +#define CREATE_ALWAYS (2) > +#define OPEN_EXISTING (3) > + > +#define RTL_NUMBER_OF(_x) \ > + sizeof(_x)/sizeof(_x[0]) > +/* > + * The context structure tracks the open state. > + */ > + > +/* > + * Packet layout for open adaptor. > + */ > + > +/* > + * Packet for querying the address list. > + */ > + > +union query_addr_list_ioctl { > + struct nd_handle in; > + union nd_sockaddr_inet out[16]; //KYS a max of 16 addresses > +}; > + > +struct pkt_query_addr_list { > + struct ndv_packet_hdr_control_1 hdr; > + union query_addr_list_ioctl ioctl; > + unsigned long activity_id; > +}; > + > + > +struct pkt_fld { > + u32 size; > + u32 offset; > +}; > + > +struct fld_data { > + union { > + u64 padding; > + }; > +}; > + > +struct extended_data_oad { > + union { > + u32 cnt; > + u64 padding; > + }; > + /* offsets are from start of extended data struct > + * and should start on 8 byte boundary > + */ > + struct pkt_fld fields[IBV_GET_CONTEXT_MAPPING_MAX]; > +}; > + > +union oad_ioctl { > + struct nd_open_adapter input; > + struct nd_resource_descriptor resrc_desc; > +}; > + > +union oad_mappings { > + struct ibv_get_context_req ctx_input; > + struct ibv_get_context_resp ctx_output; > +}; > + > +struct pkt_nd_open_adapter { > + struct ndv_packet_hdr_control_1 hdr; > + > + union oad_ioctl ioctl; > + union oad_mappings mappings; > + > + /* > + * Extended data. > + */ > + struct extended_data_oad ext_data; > +}; > + > +/* > + * Create CQ IOCTL. > + */ > + > +struct cq_db_gpa { > + u32 byte_count; > + u32 byte_offset; > + u64 pfn_array[2]; > +}; > + > +struct cq_sn_gpa { > + u32 byte_count; > + u32 byte_offset; > + u64 pfn_array[2]; > +}; > + > +struct create_cq_ext_data { > + union { > + u32 cnt; > + u64 padding; > + }; > + /* offsets are from start of extended data struct > + * and should start on 8 byte boundary > + */ > + struct pkt_fld fields[MLX4_IB_CREATE_CQ_MAPPING_MAX]; > + struct cq_db_gpa db_gpa; > + struct cq_sn_gpa sn_gpa; > + struct gpa_range cqbuf_gpa; > +}; > + > +union create_cq_ioctl { > + struct nd_create_cq input; > + struct nd_resource_descriptor resrc_desc; > +}; > + > +union create_cq_mappings { > + struct ibv_create_cq cq_in; > + struct ibv_create_cq_resp cq_resp; > +}; > + > +struct pkt_nd_create_cq { > + struct ndv_packet_hdr_control_1 hdr; > + > + union create_cq_ioctl ioctl; > + union create_cq_mappings mappings; > + > + /* > + * Extended data. > + */ > + struct create_cq_ext_data ext_data; > +}; > + > +/* > + * IOCTL to free CQ. 
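One convention running through all of these pkt_nd_* definitions is worth calling out: each ioctl member is a union whose in and out variants overlay the same storage, and the control header is built with input_output_buf_offset = 0, so the host parses the input and writes the reply over the same bytes. That is why callers earlier in the patch read pkt.ioctl.out immediately after hvnd_send_ioctl_pkt() succeeds. A sketch of the idea using the listener get-address union defined just below (function and variable names invented):

        static void listener_addr_from_reply(union listener_get_addr_ioctl *u,
                                             union nd_sockaddr_inet *addr)
        {
                /*
                 * On entry the packet holding 'u' has round-tripped: the
                 * host consumed u->in ({ version, reserved, handle }) and
                 * overwrote it, so the reply is read via the other member.
                 */
                memcpy(addr, &u->out, sizeof(*addr));
        }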
> + */ > +struct free_cq_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_free_cq { > + struct ndv_packet_hdr_control_1 hdr; > + > + struct free_cq_ioctl ioctl; > +}; > + > + > +/* > + * IOCTL to QUERY CQ - CQ NOTIFY > + */ > + > +struct notify_cq_ioctl { > + struct nd_cq_notify in; > +}; > + > +struct pkt_nd_notify_cq { > + struct ndv_packet_hdr_control_1 hdr; > + struct notify_cq_ioctl ioctl; > +}; > + > +/* > + * IOCTL to Create a listener > + */ > + > +struct nd_ep_create { > + struct nd_handle hdr; > + bool to_semantics; > + unsigned long activity_id; > +}; > + > +union listener_cr_ioctl { > + struct nd_ep_create in; > + u64 out; > +}; > + > +struct pkt_nd_cr_listener { > + struct ndv_packet_hdr_control_1 hdr; > + union listener_cr_ioctl ioctl; > +}; > + > +/* > + * IOCTL to free listener. > + */ > + > +struct listener_free_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_free_listener { > + struct ndv_packet_hdr_control_1 hdr; > + struct listener_free_ioctl ioctl; > +}; > + > +/* > + * IOCTL for listener cancel IO. > + */ > +struct listener_cancelio_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_cancelio_listener { > + struct ndv_packet_hdr_control_1 hdr; > + struct listener_cancelio_ioctl ioctl; > +}; > + > +/* > + * IOCTL for LISTENER BIND > + */ > + > +union listener_bind_ioctl { > + struct ndk_bind in; > +}; > + > +struct pkt_nd_bind_listener { > + struct ndv_packet_hdr_control_1 hdr; > + union listener_bind_ioctl ioctl; > +}; > + > +/* > + * After the listener is bound, enable > + * listening. > + */ > + > +union listener_listen_ioctl { > + struct nd_listen in; > +}; > + > +struct pkt_nd_listen_listener { > + struct ndv_packet_hdr_control_1 hdr; > + union listener_listen_ioctl ioctl; > +}; > + > +/* > + * IOCTL for getting the address from listener. > + * > + */ > + > +union listener_get_addr_ioctl { > + struct nd_handle in; > + union nd_sockaddr_inet out; > +}; > + > +struct pkt_nd_get_addr_listener { > + struct ndv_packet_hdr_control_1 hdr; > + union listener_get_addr_ioctl ioctl; > +}; > + > +/* > + * IOCTL to get a connection from a listener. > + */ > + > +union listener_get_connection_ioctl { > + struct nd_get_connection_request in; > + union nd_sockaddr_inet out; > +}; > + > +struct pkt_nd_get_connection_listener { > + struct ndv_packet_hdr_control_1 hdr; > + union listener_get_connection_ioctl ioctl; > +}; > + > + > +/* > + * Connector IOCTLs > + */ > + > +/* > + * IOCTL to create connector. > + */ > + > +union connector_cr_ioctl { //KYS should this be a union or struct? > + struct nd_ep_create in; > + u64 out; > +}; > + > +struct pkt_nd_cr_connector { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_cr_ioctl ioctl; //KYS: union or struct > +}; > + > +/* > + * IOCTL to free connector. > + */ > + > +struct connector_free_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_free_connector { > + struct ndv_packet_hdr_control_1 hdr; > + struct connector_free_ioctl ioctl; > +}; > + > +/* > + * IOCTL to cancel I/O on a connector. > + */ > + > +struct connector_cancelio_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_cancelio_connector { > + struct ndv_packet_hdr_control_1 hdr; > + struct connector_cancelio_ioctl ioctl; > +}; > + > +/* > + * IOCTL to Bind an address to the connector.
> + */ > + > +union connector_bind_ioctl { > + struct ndk_bind in; > +}; > + > +struct pkt_nd_bind_connector { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_bind_ioctl ioctl; > +}; > + > +/* > + * IOCTL to connect a connector. > + */ > + > +struct connector_connect_in { > + struct nd_connect hdr; > + u8 retry_cnt; > + u8 rnr_retry_cnt; > + u8 priv_data[56]; > + unsigned long activity_id; > +}; > + > +union connector_connect_ioctl { > + struct connector_connect_in in; > +}; > + > +struct pkt_nd_connector_connect { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_connect_ioctl ioctl; > +}; > + > +/* > + * IOCTL for connector complete connect > + */ > + > +struct complete_connect_in { > + struct nd_handle hdr; > + u8 rnr_nak_to; > + unsigned long activity_id; > +}; > + > +struct complete_connect_out { > + enum ibv_qp_state state; > +}; > + > +union connector_complete_connect_ioctl { > + struct complete_connect_in in; > + struct complete_connect_out out; > +}; > + > +struct pkt_nd_connector_connect_complete { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_complete_connect_ioctl ioctl; > +}; > + > + > +#define MAX_PRIVATE_DATA_LEN 148 > + > +/* > + * IOCTL for connector accept. > + */ > + > +struct connector_accept_in { > + struct nd_accept hdr; > + u8 rnr_retry_cnt; > + u8 rnr_nak_to; > + u8 private_data[MAX_PRIVATE_DATA_LEN]; > + unsigned long activity_id; > +}; > + > +struct connector_accept_out { > + enum ibv_qp_state state; > +}; > + > +union connector_accept_ioctl { > + struct connector_accept_in in; > + struct connector_accept_out out; > +}; > + > +struct pkt_nd_connector_accept { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_accept_ioctl ioctl; > +}; > + > +/* > + * IOCTL for connector to reject a connection. > + */ > + > +struct connector_reject_in { > + struct nd_reject hdr; > + u8 private_data[MAX_PRIVATE_DATA_LEN]; > +}; > + > +struct connector_reject_out { > + enum ibv_qp_state state; > +}; > + > +union connector_reject_ioctl { > + struct connector_reject_in in; > + struct connector_reject_out out; > +}; > + > +struct pkt_nd_connector_reject { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_reject_ioctl ioctl; > +}; > + > +/* > + * IOCTL to get connector read limits. > + */ > + > +struct connector_get_rd_limits_in { > + struct nd_handle in; > +}; > + > +struct connector_get_rd_limits_out { > + struct nd_read_limits out; > +}; > + > +union connector_get_rd_limits_ioctl { > + struct connector_get_rd_limits_in in; > + struct connector_get_rd_limits_out out; > +}; > + > +struct pkt_nd_connector_get_rd_limits { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_get_rd_limits_ioctl ioctl; > +}; > + > +/* > + * IOCTL to get connector private data. > + */ > +union connector_get_priv_data_ioctl { > + struct nd_handle in; > + u8 out[MAX_PRIVATE_DATA_LEN]; > +}; > + > +struct pkt_nd_connector_get_priv_data { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_get_priv_data_ioctl ioctl; > +}; > + > + > +/* > + * IOCTL get peer address. > + */ > + > +union connector_get_peer_addr_ioctl { > + struct nd_handle in; > + union nd_sockaddr_inet out; > +}; > + > +struct pkt_nd_connector_get_peer_addr { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_get_peer_addr_ioctl ioctl; > +}; > + > +/* > + * IOCTL to get connector address. 
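
The connector packets follow the same pattern for the active side: create and bind a connector, connect with up to 56 bytes of private data (the accept/reject path allows MAX_PRIVATE_DATA_LEN), then complete the connect. A compressed sketch (mine; in the driver the complete step runs from the connect-completion path rather than inline, and using ep->ird/ep->ord as the read limits is my assumption):

static int example_active_connect(struct hvnd_dev *nd_dev,
				  struct hvnd_ucontext *uctx,
				  union nd_sockaddr_inet *laddr,
				  union nd_sockaddr_inet *raddr,
				  struct if_physical_addr *mac,
				  u64 qp_handle, struct hvnd_ep_obj *ep,
				  const u8 *priv, u32 priv_len)
{
	enum ibv_qp_state qp_state;
	u64 conn;
	int ret;

	ret = hvnd_cr_connector(nd_dev, uctx, &conn);
	if (ret)
		return ret;

	ret = hvnd_bind_connector(nd_dev, uctx, conn, laddr);
	if (ret)
		return ret;

	ret = hvnd_connector_connect(nd_dev, uctx, conn, ep->ird, ep->ord,
				     priv_len, priv, qp_handle, mac,
				     raddr, ep);
	if (ret)
		return ret;

	return hvnd_connector_complete_connect(nd_dev, uctx, conn, &qp_state);
}
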
> + */ > + > +union connector_get_addr_ioctl { > + struct nd_handle in; > + union nd_sockaddr_inet out; > +}; > + > +struct pkt_nd_connector_get_addr { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_get_addr_ioctl ioctl; > +}; > + > +/* > + * IOCTL for disconnect notification. > + */ > + > +union connector_notify_disconnect_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_connector_notify_disconnect { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_notify_disconnect_ioctl ioctl; > +}; > + > +union connector_disconnect_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_connector_disconnect { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_notify_disconnect_ioctl ioctl; > +}; > + > +/* > + * IOCTLs for QP operations. > + */ > + > +/* > + * Create qp IOCTL. > + */ > + > +struct qp_db_gpa { > + u32 byte_count; > + u32 byte_offset; > + u64 pfn_array[1]; > +}; > + > +struct create_qp_ext_data { > + union { > + u32 cnt; > + u64 padding; > + }; > + /* offsets are from start of extended data struct > + * and should start on 8 byte boundary > + */ > + struct pkt_fld fields[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; > + struct qp_db_gpa db_gpa; > + struct gpa_range qpbuf_gpa; > +}; > + > +union create_qp_ioctl { > + struct nd_create_qp input; > + struct nd_resource_descriptor resrc_desc; > +}; > + > +union create_qp_mappings { > + struct ibv_create_qp qp_in; > + struct ibv_create_qp_resp qp_resp; > +}; > + > +struct pkt_nd_create_qp { > + struct ndv_packet_hdr_control_1 hdr; > + > + union create_qp_ioctl ioctl; > + union create_qp_mappings mappings; > + > + /* > + * Extended data. > + */ > + struct create_qp_ext_data ext_data; > +}; > + > +/* > + * IOCTL to flush a QP. > + */ > +struct flush_qp_ioctl { > + struct nd_handle in; > + enum ibv_qp_state out; > +}; > + > +struct pkt_nd_flush_qp { > + struct ndv_packet_hdr_control_1 hdr; > + struct flush_qp_ioctl ioctl; > +}; > + > +/* > + * Memory Region IOCTLS > + */ > +union create_mr_ioctl { > + struct nd_handle in; > + u64 out; > +}; > + > +struct pkt_nd_create_mr { > + struct ndv_packet_hdr_control_1 hdr; > + union create_mr_ioctl ioctl; > +}; > + > +struct mr_out { > + u32 lkey; > + u32 rkey; > + unsigned long activity_id; > +}; > + > + > +union register_mr_ioctl { > + struct nd_mr_register in; > + struct mr_out out; > +}; > + > +struct pkt_nd_register_mr { > + struct ndv_packet_hdr_control_1 hdr; > + union register_mr_ioctl ioctl; > +}; > + > +struct deregister_mr_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_deregister_mr { > + struct ndv_packet_hdr_control_1 hdr; > + struct deregister_mr_ioctl ioctl; > +}; > + > +/* > + * IOCTL to disconnect connector > + */ > + > +/* > + * Create PD IOCTL. > + */ > +struct nd_create_pd_ioctl { > + union { > + struct nd_handle in; > + u64 out_handle; > + }; > + struct ibv_alloc_pd_resp resp; > +}; > + > +struct pkt_nd_pd_create { > + struct ndv_packet_hdr_control_1 hdr; > + struct nd_create_pd_ioctl ioctl; > +}; > + > +/* > + * Free Handle. Check the layout with Luke. > + * > + */ > +struct free_handle_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_free_handle { > + struct ndv_packet_hdr_control_1 hdr; > + struct free_handle_ioctl ioctl; > +}; > + > +/* > + * Cancel I/O. 
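
The MR packets pair up with hvnd_cr_mr()/hvnd_mr_register()/hvnd_free_mr() declared below. A sketch (mine) of the two-step registration — create the host-side MR object against a PD, then register the user range:

static int example_reg_mr(struct hvnd_dev *nd_dev,
			  struct hvnd_ucontext *uctx,
			  struct hvnd_mr *mr, u64 pd_handle)
{
	int ret;

	ret = hvnd_cr_mr(nd_dev, uctx, pd_handle, &mr->mr_handle);
	if (ret)
		return ret;

	/* On success the lkey/rkey presumably come back in struct
	 * mr_out and land in mr->mr_lkey/mr->mr_rkey. */
	ret = hvnd_mr_register(nd_dev, uctx, mr);
	if (ret)
		hvnd_free_mr(nd_dev, uctx, mr->mr_handle);

	return ret;
}
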
> + */ > + > +struct cancel_io_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_cancel_io { > + struct ndv_packet_hdr_control_1 hdr; > + struct cancel_io_ioctl ioctl; > +}; > + > +/* > + * Connector states: > + */ > + > +enum connector_state { > + HVND_CON_INCOMING, > + HVND_CON_INCOMING_ESTABLISHED, > + HVND_CON_INCOMING_REJECTED, > + HVND_CON_OUTGOING_REQUEST > +}; > + > + > +/* > + * Adaptor query IOCTL. > + */ > +struct nd_adap_query_ioctl { > + union { > + struct nd_adapter_query ad_q; > + struct adapter_info_v2 ad_info; > + }; > +}; > + > +struct pkt_nd_query_adaptor { > + struct ndv_packet_hdr_control_1 hdr; > + struct nd_adap_query_ioctl ioctl; > +}; > + > +struct nd_ioctl { > + union { > + struct nd_handle handle; > + u8 raw_buffer[NDV_MAX_IOCTL_BUFFER_SIZE]; > + }; > +}; > + > +struct pkt_nd_provider_ioctl { > + struct ndv_packet_hdr_control_1 hdr; > + struct nd_ioctl ioctl; > +}; > + > +struct hvnd_ib_pd { > + struct ib_pd ibpd; > + u32 pdn; > + u64 handle; > +}; > + > +struct hvnd_work { > + struct work_struct work; > + void *callback_arg; > +}; > + > +struct hvnd_disconnect_work { > + struct work_struct work; > + int status; > + void *callback_arg; > +}; > + > +/* > +struct hvnd_delayed_work { > + struct delayed_work work; > + void *callback_arg; > +}; > +*/ > + > +enum hvnd_cm_state { > + hvnd_cm_idle = 0, > + hvnd_cm_connect_reply_sent, //active > + hvnd_cm_connect_reply_refused, > + hvnd_cm_connect_received, //active > + hvnd_cm_connect_request_sent, //passive > + hvnd_cm_accept_sent, > + hvnd_cm_close_sent, > + hvnd_cm_established_sent, > +}; > + > +struct incoming_pkt { > + struct list_head list_entry; > + char pkt[0]; > +}; > + > +struct hvnd_ep_obj { > +/* > + spinlock_t ep_lk; > + bool to_be_destroyed; > + bool io_outstanding; > + > + wait_queue_head_t wait; > + bool stopped; > + atomic_t process_refcnt; // how many NDV_PKT_ID1_COMPLETE packets we are currently processing > +*/ > + bool stopping; > + wait_queue_head_t wait_pending; > + atomic_t nr_requests_pending; > + > + enum nd_resource_type type; > + enum connector_state state; //KYS need to look at locking > + struct iw_cm_id *cm_id; > + enum hvnd_cm_state cm_state; > + struct completion block_event; > + struct completion disconnect_event; > + struct completion connector_accept_event; > + int connector_accept_status; > + u64 ep_handle; > + spinlock_t incoming_pkt_list_lock; > + struct list_head incoming_pkt_list; > + struct hvnd_ep_obj *parent; > + struct hvnd_dev *nd_dev; > + struct hvnd_ucontext *uctx; > + struct hvnd_work wrk; > + struct hvnd_cq *cq; > + u8 ord; > + u8 ird; > + char priv_data[MAX_PRIVATE_DATA_LEN]; > + bool incoming; > + atomic_t disconnect_notified; > + u64 outstanding_handle; > + u32 local_irp; > + struct hvnd_ep_obj *outstanding_ep; > + struct pkt_nd_connector_connect connector_connect_pkt; > + int connector_connect_retry; > +}; > + > +struct hvnd_ucontext { > + struct ib_ucontext ibucontext; > + struct list_head listentry; > + struct ndv_pkt_hdr_create_1 create_pkt; > + struct ndv_pkt_hdr_create_1 create_pkt_ovl; /* Overlap handle */ > + struct pkt_nd_provider_ioctl pr_init_pkt; > + union ndv_context_handle file_handle; > + union ndv_context_handle file_handle_ovl; > + > + struct pkt_nd_open_adapter o_adap_pkt; > + > + u64 adaptor_hdl; > + > + /* > + * Protection domain state. 
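
The stopping/nr_requests_pending/wait_pending trio in hvnd_ep_obj is a small in-flight request gate: ep_add_work_pending()/ep_del_work_pending() (implemented in provider.c further down) bracket every asynchronous host request so ep_stop() can cancel I/O and drain. The usage pattern, sketched — issue_request() is a hypothetical stand-in for any packet send:

static void example_submit(struct hvnd_ep_obj *ep_object)
{
	if (!ep_add_work_pending(ep_object))
		return;	/* ep_stop() already ran; do not touch the host */

	issue_request(ep_object);	/* hypothetical async send */
}

static void example_completion(struct hvnd_ep_obj *ep_object)
{
	/* ... process the host's reply ... */
	ep_del_work_pending(ep_object);	/* wakes ep_stop() when count hits 0 */
}
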
> + */ > + struct pkt_nd_pd_create pd_cr_pkt; > + > + u64 uar_base; > + u64 bf_base; > + u32 bf_buf_size; > + u32 bf_offset; > + u32 cqe_size; > + u32 max_qp_wr; > + u32 max_sge; > + u32 max_cqe; > + u32 num_qps; > + > + /* > + * State to manage doorbell pages: > + */ > + struct list_head db_page_list; > + struct mutex db_page_mutex; > + > + atomic_t refcnt; > + > +}; > + > +struct hvnd_dev { > + struct ib_device ibdev; > + struct hv_device *hvdev; > + u32 device_cap_flags; > + unsigned char nports; > + bool ib_active; > + > + /* State to manage interaction with the host. > + */ > + > + spinlock_t uctxt_lk; > + struct list_head listentry; > + > + unsigned long mmio_sz; > + unsigned long mmio_start_addr; > + struct resource mmio_resource; > + void *mmio_virt; > + > + unsigned long negotiated_version; > + union ndv_packet_init init_pkt; > + struct ndv_pkt_hdr_init_resources_1 resources; > + struct ndv_pkt_hdr_bind_1 bind_pkt; > + > + struct ndv_pkt_hdr_create_1 global_create_pkt; > + union ndv_context_handle global_file_handle; > + > + struct semaphore query_pkt_sem; > + bool query_pkt_set; > + struct pkt_nd_query_adaptor query_pkt; > + > + /* > + * ID tables. > + */ > + spinlock_t id_lock; > + > + struct idr cqidr; > + struct idr qpidr; > + struct idr mmidr; > + struct idr irpidr; > + struct idr uctxidr; > + atomic_t open_cnt; > + > + char ip_addr[4]; > + char mac_addr[6]; > + struct completion addr_set; > + int bind_complete; > + struct mutex bind_mutex; > +}; > + > +struct hvnd_cq { > + struct ib_cq ibcq; > + void *cq_buf; > + void *db_addr; > + u32 arm_sn; > + u32 entries; > + > + u32 cqn; > + u32 cqe; > + u64 cq_handle; > + > + struct ib_umem *umem; > + struct ib_umem *db_umem; > + struct mlx4_ib_user_db_page user_db_page; > + struct hvnd_ucontext *uctx; > + struct hvnd_ep_obj ep_object; //KYS need to clean this up; have a cq irp state > + bool monitor; > + bool upcall_pending; > +}; > + > +struct hvnd_qp { > + struct ib_qp ibqp; > + void *qp_buf; > + void *db_addr; > + u32 buf_size; > + u8 port; > + struct hvnd_dev *nd_dev; > + > + __u8 log_sq_bb_count; > + __u8 log_sq_stride; > + __u8 sq_no_prefetch; > + > + int rq_wqe_cnt; > + int rq_wqe_shift; > + int rq_max_gs; > + > + int sq_wqe_cnt; > + int sq_wqe_shift; > + int sq_max_gs; > + > + u32 max_inline_data; > + > + u32 initiator_q_depth; > + u32 initiator_request_sge; > + > + u32 receive_q_depth; > + u32 receive_request_sge; > + > + struct hvnd_cq *recv_cq; > + struct hvnd_cq *send_cq; > + > + u64 receive_cq_handle; > + u64 initiator_cq_handle; > + u64 pd_handle; > + > + u64 qp_handle; > + u32 qpn; > + u32 max_send_wr; > + u32 max_recv_wr; > + u32 max_send_sge; > + u32 max_recv_sge; > + > + struct ib_umem *umem; > + struct ib_umem *db_umem; > + struct mlx4_ib_user_db_page user_db_page; > + struct hvnd_ucontext *uctx; > + struct iw_cm_id *cm_id; > + > + /* > + * Current QP state; need to look at locking. > + * XXXKYS > + */ > + enum ib_qp_state qp_state; > + bool cq_notify; > + wait_queue_head_t wait; > + atomic_t refcnt; > + struct hvnd_ep_obj *connector; > +}; > + > +struct hvnd_mr { > + struct ib_mr ibmr; > + struct hvnd_ib_pd *pd; > + struct ib_umem *umem; > + u64 start; > + u64 length; > + u64 virt; > + int acc; > + u64 mr_handle; > + u32 mr_lkey; > + u32 mr_rkey; > +}; > + > +struct hvnd_cookie { > + struct completion host_event; > + void *pkt; > +}; > + > +/* > + * Definitions to retrieve the IP address. 
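
struct hvnd_cookie looks like the glue for synchronous IOCTLs: hvnd_send_ioctl_pkt() takes a u64 cookie, and callers such as hvnd_get_outgoing_rdma_addr() (in hvnd_addr.c) read the reply immediately after it returns, so the wait almost certainly happens inside. My guess at the shape of that internal path (a sketch, not code from the patch):

/* Inside a synchronous sender: block until hvnd_callback() sees the
 * host's reply, matches it to this cookie, and completes host_event. */
struct hvnd_cookie cookie = { .pkt = hdr };

init_completion(&cookie.host_event);
/* ... vmbus send, passing (u64)&cookie as the request id ... */
wait_for_completion(&cookie.host_event);
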
> + */ > + > +#define HVND_CURRENT_VERSION 0 > + > +struct hvnd_ipaddr_tuple { > + char mac_address[ETH_ALEN]; > + struct sockaddr addr; > +}; > + > +struct hvnd_msg { > + int status; > + struct hvnd_ipaddr_tuple ip_tuple; > +}; > + > +static inline struct hvnd_ib_pd *to_nd_pd(struct ib_pd *pd) > +{ > + return container_of(pd, struct hvnd_ib_pd, ibpd); > +} > + > +static inline struct hvnd_dev *to_nd_dev(struct ib_device *ibdev) > +{ > + return container_of(ibdev, struct hvnd_dev, ibdev); > +} > + > +static inline struct hvnd_cq *to_nd_cq(struct ib_cq *ibcq) > +{ > + return container_of(ibcq, struct hvnd_cq, ibcq); > +} > + > +static inline struct hvnd_qp *to_nd_qp(struct ib_qp *ibqp) > +{ > + return container_of(ibqp, struct hvnd_qp, ibqp); > +} > + > +static inline struct hvnd_ucontext *to_nd_context(struct ib_ucontext *ibucontext) > +{ > + return container_of(ibucontext, struct hvnd_ucontext, ibucontext); > +} > + > +static inline struct hvnd_ucontext *get_uctx_from_pd(struct ib_pd *pd) > +{ > + return to_nd_context(pd->uobject->context); > +} > + > +static inline struct hvnd_mr *to_nd_mr(struct ib_mr *ibmr) > +{ > + return container_of(ibmr, struct hvnd_mr, ibmr); > +} > +/* > + * ID management. > + */ > + > +static inline int insert_handle(struct hvnd_dev *dev, struct idr *idr, > + void *handle, u32 id) > +{ > + int ret; > + unsigned long flags; > + > + idr_preload(GFP_KERNEL); > + spin_lock_irqsave(&dev->id_lock, flags); > + > + ret = idr_alloc(idr, handle, id, id + 1, GFP_ATOMIC); > + > + spin_unlock_irqrestore(&dev->id_lock, flags); > + idr_preload_end(); > + > + BUG_ON(ret == -ENOSPC); > + return ret < 0 ? ret : 0; > +} > + > +static inline void remove_handle(struct hvnd_dev *dev, struct idr *idr, u32 id) > +{ > + unsigned long flags; > + > + spin_lock_irqsave(&dev->id_lock, flags); > + idr_remove(idr, id); > + spin_unlock_irqrestore(&dev->id_lock, flags); > +} > + > +static inline struct hvnd_cq *get_cqp(struct hvnd_dev *dev, u32 cqid) > +{ > + struct hvnd_cq *cqp; > + unsigned long flags; > + > + spin_lock_irqsave(&dev->id_lock, flags); > + cqp = idr_find(&dev->cqidr, cqid); > + spin_unlock_irqrestore(&dev->id_lock, flags); > + > + return cqp; > +} > + > +static inline struct hvnd_qp *get_qpp(struct hvnd_dev *dev, u32 qpid) > +{ > + struct hvnd_qp *qpp; > + unsigned long flags; > + > + spin_lock_irqsave(&dev->id_lock, flags); > + qpp = idr_find(&dev->qpidr, qpid); > + spin_unlock_irqrestore(&dev->id_lock, flags); > + > + return qpp; > +} > + > +static inline struct hvnd_ucontext *get_uctx(struct hvnd_dev *dev, u32 pid) > +{ > + struct hvnd_ucontext *uctx; > + unsigned long flags; > + > + spin_lock_irqsave(&dev->id_lock, flags); > + uctx = idr_find(&dev->uctxidr, pid); > + spin_unlock_irqrestore(&dev->id_lock, flags); > + > + return uctx; > +} > + > + > +static inline void *map_irp_to_ctx(struct hvnd_dev *nd_dev, u32 irp) > +{ > + void *ctx; > + unsigned long flags; > + > + spin_lock_irqsave(&nd_dev->id_lock, flags); > + ctx = idr_find(&nd_dev->irpidr, irp); > + spin_unlock_irqrestore(&nd_dev->id_lock, flags); > + > + return ctx; > +} > + > + > + > +void hvnd_callback(void *context); > +int hvnd_negotiate_version(struct hvnd_dev *nd_dev); > +int hvnd_init_resources(struct hvnd_dev *nd_dev); > +int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, char *mac_addr); > +int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx); > +int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx); > +int hvnd_query_adaptor(struct hvnd_dev 
*nd_dev, struct hvnd_ucontext *uctx); > +int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev, > + struct hvnd_ib_pd *hvnd_pd); > + > +/* > + * CQ operations. > + */ > +int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_cq *cq); > +int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq); > +int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq, > + u32 notify_type, u64 irp_handle); > + > +/* > + * QP operations. > + */ > +int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_qp *qp); > + > +int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_qp *qp); > + > +int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_qp *qp); > + > +/* > + * MR operations. > + */ > + > +int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 pd_handle, u64 *mr_handle); > + > +int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle); > + > +int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_mr *mr); > +int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle); > + > +/* > + * Listener operations > + */ > +int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, u64 *handle); > + > + > +int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle); > + > +int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, union nd_sockaddr_inet *addr); > + > +int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, u32 backlog); > + > +int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, union nd_sockaddr_inet *addr); > + > +int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, u64 connector_handle, > + u64 irp_handle); > + > +/* > + * Connector operations. 
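
Stepping back to the ID-management helpers just above: a self-contained sketch (mine) of the insert/lookup/remove lifecycle, using a CQ as the example:

static int example_track_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq)
{
	struct hvnd_cq *found;
	int ret;

	/* Reserve exactly id == cq->cqn in the CQ table. */
	ret = insert_handle(nd_dev, &nd_dev->cqidr, cq, cq->cqn);
	if (ret)
		return ret;

	found = get_cqp(nd_dev, cq->cqn);	/* lookup under id_lock */
	WARN_ON(found != cq);

	remove_handle(nd_dev, &nd_dev->cqidr, cq->cqn);
	return 0;
}

Note that insert_handle() BUG()s when the requested id already exists (idr_alloc() returns -ENOSPC for an exhausted [id, id+1) range), so callers must guarantee id uniqueness.
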
> + */ > +int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 *connector_handle); > + > +int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle); > + > +int hvnd_cancelio_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle); > +int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle, union nd_sockaddr_inet *addr); > + > +int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit, > + u32 priv_data_length, const u8 *priv_data, > + u64 qp_handle, struct if_physical_addr *phys_addr, > + union nd_sockaddr_inet *dest_addr, struct hvnd_ep_obj *ep); > + > +int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, enum ibv_qp_state *qp_state); > + > +int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, > + u64 qp_handle, > + u32 in_rd_limit, u32 out_rd_limit, > + u32 priv_data_length, const u8 *priv_data, > + enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep); > + > +int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, > + u32 priv_data_length, u8 *priv_data, > + enum ibv_qp_state *qp_state); > + > +int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + struct nd_read_limits *rd_limits); > + > +int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + u8 *priv_data); > + > +int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + union nd_sockaddr_inet *peer_addr); > + > +int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + union nd_sockaddr_inet *local_addr); > + > +int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, struct hvnd_ep_obj *ep); > + > + > +int hvnd_connector_disconnect(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, struct hvnd_ep_obj *ep); > + > +int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle, u32 ioctl); > + > +int hvnd_cancel_io(struct hvnd_ep_obj *ep_object); > + > +char *hvnd_get_op_name(int ioctl); > + > +void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx); > +void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx); > +void hvnd_process_events(struct work_struct *work); > + > +void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep, int status); > +void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep, int status); > +void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status); > +void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status); > +void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status); > + > +void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp); > +int get_irp_handle(struct hvnd_dev *nd_dev, u32 *local, void *irp_ctx); > + > +void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr, > + u32 data_sz, u32 local, u32 remote, > + u32 ioctl_code, > + u32 ext_data_sz, u32 ext_data_offset, > + u64 irp_handle); > + > +int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev, > + struct ndv_packet_hdr_control_1 *hdr, > + u32 pkt_size, u64 
cookie); > + > +int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + union nd_sockaddr_inet *og_addr); > + > +int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote, char *mac_addr); > + > +void hvnd_addr_init(void); > + > +void hvnd_addr_deinit(void); > + > +bool ep_add_work_pending(struct hvnd_ep_obj *ep_object); > +void ep_del_work_pending(struct hvnd_ep_obj *ep_object); > +void ep_stop(struct hvnd_ep_obj *ep_object); > + > +#define current_pid() (current->pid) > +/* > + * NT STATUS defines. > + */ > + > +#define STATUS_SUCCESS 0x0 > +#define STATUS_PENDING 0x00000103 > +#define STATUS_CANCELLED 0xC0000120 > +#define STATUS_DISCONNECTED 0xC000020C > +#define STATUS_TIMEOUT 0xC00000B5 > + > +void inc_ioctl_counter_request(unsigned ioctl); > +void inc_ioctl_counter_response(unsigned ioctl); > + > +#define NDV_PROTOCOL_VAERSION_INVALID -1 > +#define NDV_PACKET_INIT_SIZE 16 /* Size of the INIT packet */ > + > +#define HVND_RING_SZ (PAGE_SIZE * 64) > + > +/* logging levels */ > +#define HVND_ERROR 0 > +#define HVND_WARN 1 > +#define HVND_INFO 2 > +#define HVND_DEBUG 3 > + > +extern int hvnd_log_level; > + > +#define hvnd_error(fmt, args...) hvnd_log(HVND_ERROR, fmt, ##args) > +#define hvnd_warn(fmt, args...) hvnd_log(HVND_WARN, fmt, ##args) > +#define hvnd_info(fmt, args...) hvnd_log(HVND_INFO, fmt, ##args) > +#define hvnd_debug(fmt, args...) hvnd_log(HVND_DEBUG, fmt, ##args) > + > +#define hvnd_log(level, fmt, args...) \ > +do { \ > + if (unlikely(hvnd_log_level >= (level))) \ > + printk(KERN_ERR "hvnd %s[%u]: " fmt, __func__, __LINE__, ##args); \ > +} while (0) > + > +#endif /* _VMBUS_RDMA_H */ > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/COPYING b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/COPYING > deleted file mode 100644 > index f0960c946b07..000000000000 > --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/COPYING > +++ /dev/null > @@ -1,339 +0,0 @@ > - > -"This software program is licensed subject to the GNU General Public License > -(GPL). Version 2, June 1991, available at > -<http://www.fsf.org/copyleft/gpl.html>" > - > -GNU General Public License > - > -Version 2, June 1991 > - > -Copyright (C) 1989, 1991 Free Software Foundation, Inc. > -51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA > - > -Everyone is permitted to copy and distribute verbatim copies of this license > -document, but changing it is not allowed. > - > -Preamble > - > -The licenses for most software are designed to take away your freedom to > -share and change it. By contrast, the GNU General Public License is intended > -to guarantee your freedom to share and change free software--to make sure > -the software is free for all its users. This General Public License applies > -to most of the Free Software Foundation's software and to any other program > -whose authors commit to using it. (Some other Free Software Foundation > -software is covered by the GNU Library General Public License instead.) You > -can apply it to your programs, too. > - > -When we speak of free software, we are referring to freedom, not price. Our > -General Public Licenses are designed to make sure that you have the freedom > -to distribute copies of free software (and charge for this service if you > -wish), that you receive source code or can get it if you want it, that you > -can change the software or use pieces of it in new free programs; and that > -you know you can do these things. 
> - > -To protect your rights, we need to make restrictions that forbid anyone to > -deny you these rights or to ask you to surrender the rights. These > -restrictions translate to certain responsibilities for you if you distribute > -copies of the software, or if you modify it. > - > -For example, if you distribute copies of such a program, whether gratis or > -for a fee, you must give the recipients all the rights that you have. You > -must make sure that they, too, receive or can get the source code. And you > -must show them these terms so they know their rights. > - > -We protect your rights with two steps: (1) copyright the software, and (2) > -offer you this license which gives you legal permission to copy, distribute > -and/or modify the software. > - > -Also, for each author's protection and ours, we want to make certain that > -everyone understands that there is no warranty for this free software. If > -the software is modified by someone else and passed on, we want its > -recipients to know that what they have is not the original, so that any > -problems introduced by others will not reflect on the original authors' > -reputations. > - > -Finally, any free program is threatened constantly by software patents. We > -wish to avoid the danger that redistributors of a free program will > -individually obtain patent licenses, in effect making the program > -proprietary. To prevent this, we have made it clear that any patent must be > -licensed for everyone's free use or not licensed at all. > - > -The precise terms and conditions for copying, distribution and modification > -follow. > - > -TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION > - > -0. This License applies to any program or other work which contains a notice > - placed by the copyright holder saying it may be distributed under the > - terms of this General Public License. The "Program", below, refers to any > - such program or work, and a "work based on the Program" means either the > - Program or any derivative work under copyright law: that is to say, a > - work containing the Program or a portion of it, either verbatim or with > - modifications and/or translated into another language. (Hereinafter, > - translation is included without limitation in the term "modification".) > - Each licensee is addressed as "you". > - > - Activities other than copying, distribution and modification are not > - covered by this License; they are outside its scope. The act of running > - the Program is not restricted, and the output from the Program is covered > - only if its contents constitute a work based on the Program (independent > - of having been made by running the Program). Whether that is true depends > - on what the Program does. > - > -1. You may copy and distribute verbatim copies of the Program's source code > - as you receive it, in any medium, provided that you conspicuously and > - appropriately publish on each copy an appropriate copyright notice and > - disclaimer of warranty; keep intact all the notices that refer to this > - License and to the absence of any warranty; and give any other recipients > - of the Program a copy of this License along with the Program. > - > - You may charge a fee for the physical act of transferring a copy, and you > - may at your option offer warranty protection in exchange for a fee. > - > -2. 
You may modify your copy or copies of the Program or any portion of it, > - thus forming a work based on the Program, and copy and distribute such > - modifications or work under the terms of Section 1 above, provided that > - you also meet all of these conditions: > - > - * a) You must cause the modified files to carry prominent notices stating > - that you changed the files and the date of any change. > - > - * b) You must cause any work that you distribute or publish, that in > - whole or in part contains or is derived from the Program or any part > - thereof, to be licensed as a whole at no charge to all third parties > - under the terms of this License. > - > - * c) If the modified program normally reads commands interactively when > - run, you must cause it, when started running for such interactive > - use in the most ordinary way, to print or display an announcement > - including an appropriate copyright notice and a notice that there is > - no warranty (or else, saying that you provide a warranty) and that > - users may redistribute the program under these conditions, and > - telling the user how to view a copy of this License. (Exception: if > - the Program itself is interactive but does not normally print such > - an announcement, your work based on the Program is not required to > - print an announcement.) > - > - These requirements apply to the modified work as a whole. If identifiable > - sections of that work are not derived from the Program, and can be > - reasonably considered independent and separate works in themselves, then > - this License, and its terms, do not apply to those sections when you > - distribute them as separate works. But when you distribute the same > - sections as part of a whole which is a work based on the Program, the > - distribution of the whole must be on the terms of this License, whose > - permissions for other licensees extend to the entire whole, and thus to > - each and every part regardless of who wrote it. > - > - Thus, it is not the intent of this section to claim rights or contest > - your rights to work written entirely by you; rather, the intent is to > - exercise the right to control the distribution of derivative or > - collective works based on the Program. > - > - In addition, mere aggregation of another work not based on the Program > - with the Program (or with a work based on the Program) on a volume of a > - storage or distribution medium does not bring the other work under the > - scope of this License. > - > -3. You may copy and distribute the Program (or a work based on it, under > - Section 2) in object code or executable form under the terms of Sections > - 1 and 2 above provided that you also do one of the following: > - > - * a) Accompany it with the complete corresponding machine-readable source > - code, which must be distributed under the terms of Sections 1 and 2 > - above on a medium customarily used for software interchange; or, > - > - * b) Accompany it with a written offer, valid for at least three years, > - to give any third party, for a charge no more than your cost of > - physically performing source distribution, a complete machine- > - readable copy of the corresponding source code, to be distributed > - under the terms of Sections 1 and 2 above on a medium customarily > - used for software interchange; or, > - > - * c) Accompany it with the information you received as to the offer to > - distribute corresponding source code. 
(This alternative is allowed > - only for noncommercial distribution and only if you received the > - program in object code or executable form with such an offer, in > - accord with Subsection b above.) > - > - The source code for a work means the preferred form of the work for > - making modifications to it. For an executable work, complete source code > - means all the source code for all modules it contains, plus any > - associated interface definition files, plus the scripts used to control > - compilation and installation of the executable. However, as a special > - exception, the source code distributed need not include anything that is > - normally distributed (in either source or binary form) with the major > - components (compiler, kernel, and so on) of the operating system on which > - the executable runs, unless that component itself accompanies the > - executable. > - > - If distribution of executable or object code is made by offering access > - to copy from a designated place, then offering equivalent access to copy > - the source code from the same place counts as distribution of the source > - code, even though third parties are not compelled to copy the source > - along with the object code. > - > -4. You may not copy, modify, sublicense, or distribute the Program except as > - expressly provided under this License. Any attempt otherwise to copy, > - modify, sublicense or distribute the Program is void, and will > - automatically terminate your rights under this License. However, parties > - who have received copies, or rights, from you under this License will not > - have their licenses terminated so long as such parties remain in full > - compliance. > - > -5. You are not required to accept this License, since you have not signed > - it. However, nothing else grants you permission to modify or distribute > - the Program or its derivative works. These actions are prohibited by law > - if you do not accept this License. Therefore, by modifying or > - distributing the Program (or any work based on the Program), you > - indicate your acceptance of this License to do so, and all its terms and > - conditions for copying, distributing or modifying the Program or works > - based on it. > - > -6. Each time you redistribute the Program (or any work based on the > - Program), the recipient automatically receives a license from the > - original licensor to copy, distribute or modify the Program subject to > - these terms and conditions. You may not impose any further restrictions > - on the recipients' exercise of the rights granted herein. You are not > - responsible for enforcing compliance by third parties to this License. > - > -7. If, as a consequence of a court judgment or allegation of patent > - infringement or for any other reason (not limited to patent issues), > - conditions are imposed on you (whether by court order, agreement or > - otherwise) that contradict the conditions of this License, they do not > - excuse you from the conditions of this License. If you cannot distribute > - so as to satisfy simultaneously your obligations under this License and > - any other pertinent obligations, then as a consequence you may not > - distribute the Program at all. For example, if a patent license would > - not permit royalty-free redistribution of the Program by all those who > - receive copies directly or indirectly through you, then the only way you > - could satisfy both it and this License would be to refrain entirely from > - distribution of the Program. 
> - > - If any portion of this section is held invalid or unenforceable under any > - particular circumstance, the balance of the section is intended to apply > - and the section as a whole is intended to apply in other circumstances. > - > - It is not the purpose of this section to induce you to infringe any > - patents or other property right claims or to contest validity of any > - such claims; this section has the sole purpose of protecting the > - integrity of the free software distribution system, which is implemented > - by public license practices. Many people have made generous contributions > - to the wide range of software distributed through that system in > - reliance on consistent application of that system; it is up to the > - author/donor to decide if he or she is willing to distribute software > - through any other system and a licensee cannot impose that choice. > - > - This section is intended to make thoroughly clear what is believed to be > - a consequence of the rest of this License. > - > -8. If the distribution and/or use of the Program is restricted in certain > - countries either by patents or by copyrighted interfaces, the original > - copyright holder who places the Program under this License may add an > - explicit geographical distribution limitation excluding those countries, > - so that distribution is permitted only in or among countries not thus > - excluded. In such case, this License incorporates the limitation as if > - written in the body of this License. > - > -9. The Free Software Foundation may publish revised and/or new versions of > - the General Public License from time to time. Such new versions will be > - similar in spirit to the present version, but may differ in detail to > - address new problems or concerns. > - > - Each version is given a distinguishing version number. If the Program > - specifies a version number of this License which applies to it and "any > - later version", you have the option of following the terms and > - conditions either of that version or of any later version published by > - the Free Software Foundation. If the Program does not specify a version > - number of this License, you may choose any version ever published by the > - Free Software Foundation. > - > -10. If you wish to incorporate parts of the Program into other free programs > - whose distribution conditions are different, write to the author to ask > - for permission. For software which is copyrighted by the Free Software > - Foundation, write to the Free Software Foundation; we sometimes make > - exceptions for this. Our decision will be guided by the two goals of > - preserving the free status of all derivatives of our free software and > - of promoting the sharing and reuse of software generally. > - > - NO WARRANTY > - > -11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY > - FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN > - OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES > - PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER > - EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED > - WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE > - ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH > - YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL > - NECESSARY SERVICING, REPAIR OR CORRECTION. > - > -12. 
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING > - WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR > - REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR > - DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL > - DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM > - (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED > - INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF > - THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR > - OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. > - > -END OF TERMS AND CONDITIONS > - > -How to Apply These Terms to Your New Programs > - > -If you develop a new program, and you want it to be of the greatest > -possible use to the public, the best way to achieve this is to make it free > -software which everyone can redistribute and change under these terms. > - > -To do so, attach the following notices to the program. It is safest to > -attach them to the start of each source file to most effectively convey the > -exclusion of warranty; and each file should have at least the "copyright" > -line and a pointer to where the full notice is found. > - > -one line to give the program's name and an idea of what it does. > -Copyright (C) yyyy name of author > - > -This program is free software; you can redistribute it and/or modify it > -under the terms of the GNU General Public License as published by the Free > -Software Foundation; either version 2 of the License, or (at your option) > -any later version. > - > -This program is distributed in the hope that it will be useful, but WITHOUT > -ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > -FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > -more details. > - > -You should have received a copy of the GNU General Public License along with > -this program; if not, write to the Free Software Foundation, Inc., 59 > -Temple Place - Suite 330, Boston, MA 02111-1307, USA. > - > -Also add information on how to contact you by electronic and paper mail. > - > -If the program is interactive, make it output a short notice like this when > -it starts in an interactive mode: > - > -Gnomovision version 69, Copyright (C) year name of author Gnomovision comes > -with ABSOLUTELY NO WARRANTY; for details type 'show w'. This is free > -software, and you are welcome to redistribute it under certain conditions; > -type 'show c' for details. > - > -The hypothetical commands 'show w' and 'show c' should show the appropriate > -parts of the General Public License. Of course, the commands you use may be > -called something other than 'show w' and 'show c'; they could even be > -mouse-clicks or menu items--whatever suits your program. > - > -You should also get your employer (if you work as a programmer) or your > -school, if any, to sign a "copyright disclaimer" for the program, if > -necessary. Here is a sample; alter the names: > - > -Yoyodyne, Inc., hereby disclaims all copyright interest in the program > -'Gnomovision' (which makes passes at compilers) written by James Hacker. > - > -signature of Ty Coon, 1 April 1989 > -Ty Coon, President of Vice > - > -This General Public License does not permit incorporating your program into > -proprietary programs. If your program is a subroutine library, you may > -consider it more useful to permit linking proprietary applications with the > -library. 
If this is what you want to do, use the GNU Library General Public > -License instead of this License. > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/Makefile b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/Makefile > deleted file mode 100644 > index 0373d95341c1..000000000000 > --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/Makefile > +++ /dev/null > @@ -1,3 +0,0 @@ > -obj-$(CONFIG_HYPERV_INFINIBAND_ND) += hv_network_direct_142.o > - > -hv_network_direct_142-y := provider.o vmbus_rdma.o hvnd_addr.o > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/hvnd_addr.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/hvnd_addr.c > deleted file mode 100644 > index c8c2295b63af..000000000000 > --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/hvnd_addr.c > +++ /dev/null > @@ -1,136 +0,0 @@ > -/* > - * Copyright (c) 2014, Microsoft Corporation. > - * > - * Author: > - * K. Y. Srinivasan <kys@microsoft.com> > - * > - * This program is free software; you can redistribute it and/or modify it > - * under the terms of the GNU General Public License version 2 as published > - * by the Free Software Foundation. > - * > - * This program is distributed in the hope that it will be useful, but > - * WITHOUT ANY WARRANTY; without even the implied warranty of > - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or > - * NON INFRINGEMENT. See the GNU General Public License for more > - * details. > - * > - * Bug fixes/enhancements: Long Li <longli@microsoft.com> > - */ > - > -#include <linux/completion.h> > -#include <linux/module.h> > -#include <linux/errno.h> > -#include <linux/hyperv.h> > -#include <linux/efi.h> > -#include <linux/slab.h> > -#include <linux/cred.h> > -#include <linux/sched.h> > -#include <linux/types.h> > -#include <linux/completion.h> > -#include <linux/scatterlist.h> > -#include <rdma/ib_umem.h> > -#include <rdma/ib_addr.h> > - > -#include "vmbus_rdma.h" > - > - > -#include <linux/semaphore.h> > -#include <linux/fs.h> > -#include <linux/nls.h> > -#include <linux/workqueue.h> > -#include <linux/cdev.h> > -#include <linux/hyperv.h> > -#include <linux/sched.h> > -#include <linux/uaccess.h> > -#include <linux/miscdevice.h> > -#include <linux/hyperv.h> > - > - > -int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - union nd_sockaddr_inet *og_addr) > -{ > - int ret; > - /* > - * Query the host and select the first address. > - */ > - struct pkt_query_addr_list pkt; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - > - hvnd_init_hdr(&pkt.hdr, > - (sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1)), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST, 0, 0, 0); > - > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = uctx->adaptor_hdl; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, > - sizeof(pkt), (u64)&pkt); > - > - if (ret) > - return ret; > - > - /* > - * Copy the address out. 
> - */ > - > - memcpy(og_addr, &pkt.ioctl.out[0], sizeof(*og_addr)); > - return 0; > - > -} > - > -static struct rdma_addr_client self; > - > -struct resolve_cb_context { > - struct rdma_dev_addr *addr; > - struct completion comp; > -}; > - > -void hvnd_addr_init(void) > -{ > - rdma_addr_register_client(&self); > -} > - > -void hvnd_addr_deinit(void) > -{ > - rdma_addr_unregister_client(&self); > -} > - > -static void resolve_cb(int status, struct sockaddr *src_addr, > - struct rdma_dev_addr *addr, void *context) > -{ > - memcpy(((struct resolve_cb_context *)context)->addr, addr, sizeof(struct > - rdma_dev_addr)); > - complete(&((struct resolve_cb_context *)context)->comp); > -} > - > -int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote, > - char *mac_addr) > -{ > - struct rdma_dev_addr dev_addr; > - struct resolve_cb_context ctx; > - int ret; > - > - memset(&dev_addr, 0, sizeof(dev_addr)); > - dev_addr.net = &init_net; > - ctx.addr = &dev_addr; > - init_completion(&ctx.comp); > - > - ret = rdma_resolve_ip(&self, local, remote, &dev_addr, 1000, > - resolve_cb, &ctx); > - > - if (ret) { > - hvnd_error("rdma_resolve_ip failed ret=%d\n", ret); > - return ret; > - } > - > - wait_for_completion(&ctx.comp); > - memcpy(mac_addr, dev_addr.dst_dev_addr, ETH_ALEN); > - return ret; > -} > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/mx_abi.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/mx_abi.h > deleted file mode 100644 > index 8f55cb408de0..000000000000 > --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/mx_abi.h > +++ /dev/null > @@ -1,232 +0,0 @@ > -/* > - * Copyright (c) 2005 Topspin Communications. All rights reserved. > - * Copyright (c) 2005 Cisco Systems. All rights reserved. > - * Copyright (c) 2005 PathScale, Inc. All rights reserved. > - * > - * This software is available to you under a choice of one of two > - * licenses. You may choose to be licensed under the terms of the GNU > - * General Public License (GPL) Version 2, available from the file > - * COPYING in the main directory of this source tree, or the > - * OpenIB.org BSD license below: > - * > - * Redistribution and use in source and binary forms, with or > - * without modification, are permitted provided that the following > - * conditions are met: > - * > - * - Redistributions of source code must retain the above > - * copyright notice, this list of conditions and the following > - * disclaimer. > - * > - * - Redistributions in binary form must reproduce the above > - * copyright notice, this list of conditions and the following > - * disclaimer in the documentation and/or other materials > - * provided with the distribution. > - * > - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND > - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS > - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN > - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN > - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE > - * SOFTWARE. > - * > - * KYS: made some modifications. > - */ > - > -#ifndef MX_ABI_H > -#define MX_ABI_H > - > - > - > -/* > - * Make sure that all structs defined in this file remain laid out so > - * that they pack the same way on 32-bit and 64-bit architectures (to > - * avoid incompatibility between 32-bit userspace and 64-bit kernels). 
> - * Specifically: > - * - Do not use pointer types -- pass pointers in UINT64 instead. > - * - Make sure that any structure larger than 4 bytes is padded to a > - * multiple of 8 bytes. Otherwise the structure size will be > - * different between 32-bit and 64-bit architectures. > - */ > - > -enum ibv_get_context_mappings { > - IBV_GET_CONTEXT_UAR, > - IBV_GET_CONTEXT_BF, > - IBV_GET_CONTEXT_MAPPING_MAX > -}; > - > -struct ibv_get_context_req { > - > - union nd_mapping mappings[IBV_GET_CONTEXT_MAPPING_MAX]; > -}; > - > -struct ibv_get_context_resp { > - > - // mmap UAR and BF > - struct nd_mapping_result mapping_results[IBV_GET_CONTEXT_MAPPING_MAX]; > - > - // mmap Blue Flame > - int bf_buf_size; > - int bf_offset; > - > - // mlx4_query_device result > - int max_qp_wr; > - int max_sge; > - int max_cqe; > - > - // general parameters > - u32 cqe_size; > - u32 vend_id; > - u16 dev_id; > - u16 bf_reg_size; > - u16 bf_regs_per_page; > - u16 reserved1; > - > - // ibv_cmd_get_context result > - u32 qp_tab_size; > - > - u32 reserved2; > -}; > - > -struct ibv_alloc_pd_resp { > - u64 pd_handle; > - u32 pdn; > - u32 reserved; > -}; > - > -struct ibv_reg_mr { > - u64 start; > - u64 length; > - u64 hca_va; > - u32 access_flags; > - u32 pdn; > - u64 pd_handle; > -}; > - > -struct ibv_reg_mr_resp { > - u64 mr_handle; > - u32 lkey; > - u32 rkey; > -}; > - > - > -enum mlx4_ib_create_cq_mapping { > - MLX4_IB_CREATE_CQ_BUF, > - MLX4_IB_CREATE_CQ_DB, > - MLX4_IB_CREATE_CQ_ARM_SN, // Windows specific > - MLX4_IB_CREATE_CQ_MAPPING_MAX > -}; > - > -#define MLX4_CQ_FLAGS_ARM_IN_KERNEL 1 > - > -struct ibv_create_cq { > - union nd_mapping mappings[MLX4_IB_CREATE_CQ_MAPPING_MAX]; > - u32 flags; > -}; > - > -struct ibv_create_cq_resp { > - struct nd_mapping_result mapping_results[MLX4_IB_CREATE_CQ_MAPPING_MAX]; > - u32 cqn; > - u32 cqe; > -}; > - > -enum mlx4_ib_create_srq_mappings { > - MLX4_IB_CREATE_SRQ_BUF, > - MLX4_IB_CREATE_SRQ_DB, > - MLX4_IB_CREATE_SRQ_MAPPINGS_MAX > -}; > - > -struct ibv_create_srq { > - union nd_mapping mappings[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX]; > -}; > - > -struct ibv_create_srq_resp { > - struct nd_mapping_result mapping_results[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX]; > -}; > - > -enum mlx4_ib_create_qp_mappings { > - MLX4_IB_CREATE_QP_BUF, > - MLX4_IB_CREATE_QP_DB, > - MLX4_IB_CREATE_QP_MAPPINGS_MAX > -}; > - > -struct ibv_create_qp { > - union nd_mapping mappings[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; > - u8 log_sq_bb_count; > - u8 log_sq_stride; > - u8 sq_no_prefetch; > - u8 reserved; > -}; > - > -struct ibv_create_qp_resp { > - struct nd_mapping_result mapping_results[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; > - // struct ib_uverbs_create_qp_resp > - u64 qp_handle; > - u32 qpn; > - u32 max_send_wr; > - u32 max_recv_wr; > - u32 max_send_sge; > - u32 max_recv_sge; > - u32 max_inline_data; > -}; > - > -enum ibv_qp_attr_mask { > - IBV_QP_STATE = 1 << 0, > - IBV_QP_CUR_STATE = 1 << 1, > - IBV_QP_EN_SQD_ASYNC_NOTIFY = 1 << 2, > - IBV_QP_ACCESS_FLAGS = 1 << 3, > - IBV_QP_PKEY_INDEX = 1 << 4, > - IBV_QP_PORT = 1 << 5, > - IBV_QP_QKEY = 1 << 6, > - IBV_QP_AV = 1 << 7, > - IBV_QP_PATH_MTU = 1 << 8, > - IBV_QP_TIMEOUT = 1 << 9, > - IBV_QP_RETRY_CNT = 1 << 10, > - IBV_QP_RNR_RETRY = 1 << 11, > - IBV_QP_RQ_PSN = 1 << 12, > - IBV_QP_MAX_QP_RD_ATOMIC = 1 << 13, > - IBV_QP_ALT_PATH = 1 << 14, > - IBV_QP_MIN_RNR_TIMER = 1 << 15, > - IBV_QP_SQ_PSN = 1 << 16, > - IBV_QP_MAX_DEST_RD_ATOMIC = 1 << 17, > - IBV_QP_PATH_MIG_STATE = 1 << 18, > - IBV_QP_CAP = 1 << 19, > - IBV_QP_DEST_QPN = 1 << 20 > -}; > - > -enum ibv_qp_state 
{ > - IBV_QPS_RESET, > - IBV_QPS_INIT, > - IBV_QPS_RTR, > - IBV_QPS_RTS, > - IBV_QPS_SQD, > - IBV_QPS_SQE, > - IBV_QPS_ERR > -}; > - > - > -struct ibv_modify_qp_resp { > - enum ibv_qp_attr_mask attr_mask; > - u8 qp_state; > - u8 reserved[3]; > -}; > - > -struct ibv_create_ah_resp { > - u64 start; > -}; > - > -/* > - * Some mlx4 specific kernel definitions. Perhaps could be in > - * separate file. > - */ > - > -struct mlx4_ib_user_db_page { > - struct list_head list; > - struct ib_umem *umem; > - unsigned long user_virt; > - int refcnt; > -}; > - > - > -#endif /* MX_ABI_H */ > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/provider.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/provider.c > deleted file mode 100644 > index 7115cf769fe0..000000000000 > --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/provider.c > +++ /dev/null > @@ -1,2961 +0,0 @@ > -/* > - * Copyright (c) 2014, Microsoft Corporation. > - * > - * Author: > - * K. Y. Srinivasan <kys@microsoft.com> > - * > - * This program is free software; you can redistribute it and/or modify it > - * under the terms of the GNU General Public License version 2 as published > - * by the Free Software Foundation. > - * > - * This program is distributed in the hope that it will be useful, but > - * WITHOUT ANY WARRANTY; without even the implied warranty of > - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or > - * NON INFRINGEMENT. See the GNU General Public License for more > - * details. > - * > - * Bug fixes/enhancements: Long Li <longli@microsoft.com> > - */ > - > -#include <linux/module.h> > -#include <linux/moduleparam.h> > -#include <linux/device.h> > -#include <linux/netdevice.h> > -#include <linux/etherdevice.h> > -#include <linux/delay.h> > -#include <linux/errno.h> > -#include <linux/list.h> > -#include <linux/spinlock.h> > -#include <linux/ethtool.h> > -#include <linux/rtnetlink.h> > -#include <linux/inetdevice.h> > -#include <linux/io.h> > -#include <linux/hyperv.h> > -#include <linux/completion.h> > -#include <rdma/iw_cm.h> > -#include <rdma/ib_verbs.h> > -#include <rdma/ib_smi.h> > -#include <rdma/ib_umem.h> > -#include <rdma/ib_user_verbs.h> > -#include <rdma/rdma_cm.h> > -#include <linux/proc_fs.h> > -#include <linux/seq_file.h> > -#include "vmbus_rdma.h" > - > -/* > - * We are emulating mlx4. XXXKYS: May have to FIX. 
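
On the layout comment at the top of mx_abi.h above (no pointers, pad anything larger than 4 bytes to a multiple of 8): a tiny illustration of why the rule matters; the struct names here are mine:

/* sizeof() disagrees across ABIs: 8 on 32-bit, 16 on 64-bit. */
struct layout_bad {
	void *buf;	/* 4 bytes on 32-bit, 8 on 64-bit */
	u32 len;
};

/* 16 bytes on both ABIs: pointer carried as u64, explicit tail pad. */
struct layout_good {
	u64 buf;
	u32 len;
	u32 reserved;
};
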
> - */ > -#include "./user.h" > - > -static struct hvnd_dev *g_nd_dev; // the one and only one > - > -int hvnd_log_level = HVND_ERROR; > -module_param(hvnd_log_level, int, 0644); > -MODULE_PARM_DESC(hvnd_log_level, > - "Logging level, 0 - Error (default), 1 - Warning, 2 - Info, 3 - Debug."); > - > -static int disable_cq_notify = 1; > -//static int disable_cq_notify = 0; > -module_param(disable_cq_notify, int, 0644); > -MODULE_PARM_DESC(disable_cq_notify, > - "Disable CQ notification, 0 - Enable, 1 - Disable (default)."); > - > -enum { > - MLX4_USER_DEV_CAP_64B_CQE = 1L << 0 > -}; > - > -#define HVND_NODE_DESC "vmbus-RDMA" > - > -#undef MLX4_IB_UVERBS_ABI_VERSION > -#define MLX4_IB_UVERBS_ABI_VERSION 4 > - > -struct mlx4_wqe_data_seg { > - __be32 byte_count; > - __be32 lkey; > - __be64 addr; > -}; > - > -/* return value: > - * true: ep is running > - * false: ep is stopped > - */ > -bool ep_add_work_pending(struct hvnd_ep_obj *ep_object) > -{ > - bool ret = true; > - > - atomic_inc(&ep_object->nr_requests_pending); > - if (ep_object->stopping) { > - if (atomic_dec_and_test(&ep_object->nr_requests_pending)) > - wake_up(&ep_object->wait_pending); > - ret = false; > - } > - return ret; > -} > - > -void ep_del_work_pending(struct hvnd_ep_obj *ep_object) > -{ > - if (atomic_dec_and_test(&ep_object->nr_requests_pending)) > - wake_up(&ep_object->wait_pending); > - > - if (atomic_read(&ep_object->nr_requests_pending) < 0) { > - hvnd_error("ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", > - atomic_read(&ep_object->nr_requests_pending), > - ep_object->type, ep_object->cm_state); > - dump_stack(); > - } > -} > - > -void ep_stop(struct hvnd_ep_obj *ep_object) > -{ > - if (!ep_object->stopping) { > - ep_object->stopping = true; > - hvnd_cancel_io(ep_object); > - } > - > - if (atomic_read(&ep_object->nr_requests_pending) < 0) { > - hvnd_error("IO canceled, ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", > - atomic_read(&ep_object->nr_requests_pending), > - ep_object->type, ep_object->cm_state); > - dump_stack(); > - } > - > - wait_event(ep_object->wait_pending, > - !atomic_read(&ep_object->nr_requests_pending)); > -} > - > -static int vmbus_dma_map_sg(struct device *dev, struct scatterlist *sgl, > - int nents, enum dma_data_direction direction, > - unsigned long attrs) > -{ > - struct scatterlist *sg; > - u64 addr; > - int i; > - int ret = nents; > - > - BUG_ON(!valid_dma_direction(direction)); > - > - for_each_sg(sgl, sg, nents, i) { > - addr = (u64) page_address(sg_page(sg)); > - /* TODO: handle highmem pages */ > - if (!addr) { > - ret = 0; > - break; > - } > - sg->dma_address = addr + sg->offset; > - sg->dma_length = sg->length; > - } > - return ret; > -} > - > -static void vmbus_dma_unmap_sg(struct device *dev, > - struct scatterlist *sg, int nents, > - enum dma_data_direction direction, unsigned long attrs) > -{ > - BUG_ON(!valid_dma_direction(direction)); > -} > - > - > -const struct dma_map_ops vmbus_dma_ops = { > - .map_sg = vmbus_dma_map_sg, > - .unmap_sg = vmbus_dma_unmap_sg, > -}; > - > -static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, > - struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx); > - > -static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, > - struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx); > - > -static void hvnd_deinit_ep(struct hvnd_ep_obj *ep) > -{ > - put_irp_handle(ep->nd_dev, ep->local_irp); > -} > - > -static void hvnd_destroy_ep(struct hvnd_ep_obj *ep) > -{ > - hvnd_debug("canceling work for 
ep %p\n", ep); > - cancel_work_sync(&ep->wrk.work); > - hvnd_deinit_ep(ep); > - kfree(ep); > -} > - > - > -#define UC(b) (((int)b)&0xff) > -char *debug_inet_ntoa(struct in_addr in, char *b) > -{ > - register char *p; > - > - p = (char *)∈ > - (void)snprintf(b, 20, > - "%d.%d.%d.%d", UC(p[0]), UC(p[1]), UC(p[2]), UC(p[3])); > - return b; > -} > - > -void hvnd_process_events(struct work_struct *work); > - > -static int hvnd_init_ep(struct hvnd_ep_obj *ep_object, > - struct iw_cm_id *cm_id, int ep_type, > - struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx) > -{ > - int ret; > - > - ep_object->type = ep_type; > - ep_object->cm_id = cm_id; > - ep_object->nd_dev = nd_dev; > - ep_object->uctx = uctx; > - > - ep_object->parent = NULL; > - > - ep_object->wrk.callback_arg = ep_object; > - INIT_WORK(&ep_object->wrk.work, hvnd_process_events); > - INIT_LIST_HEAD(&ep_object->incoming_pkt_list); > - spin_lock_init(&ep_object->incoming_pkt_list_lock); > - > - ep_object->stopping = false; > - atomic_set(&ep_object->nr_requests_pending, 0); > - init_waitqueue_head(&ep_object->wait_pending); > - > - ret = get_irp_handle(nd_dev, &ep_object->local_irp, (void *)ep_object); > - > - if (ret) { > - hvnd_error("get_irp_handle() failed: err: %d\n", ret); > - return ret; > - } > - return 0; > - > -} > - > -static int set_rq_size(struct hvnd_dev *dev, struct ib_qp_cap *cap, > - struct hvnd_qp *qp) > -{ > - > - /* HW requires >= 1 RQ entry with >= 1 gather entry */ > - if (!cap->max_recv_wr || !cap->max_recv_sge) > - return -EINVAL; > - > - qp->rq_wqe_cnt = roundup_pow_of_two(max(1U, cap->max_recv_wr)); > - qp->rq_max_gs = roundup_pow_of_two(max(1U, cap->max_recv_sge)); > - qp->rq_wqe_shift = ilog2( > - qp->rq_max_gs * sizeof(struct mlx4_wqe_data_seg)); > - > - > - return 0; > -} > - > -static int set_user_sq_size(struct hvnd_dev *dev, > - struct hvnd_qp *qp, > - struct mlx4_ib_create_qp *ucmd) > -{ > - qp->sq_wqe_cnt = 1 << ucmd->log_sq_bb_count; > - qp->sq_wqe_shift = ucmd->log_sq_stride; > - > - qp->buf_size = (qp->rq_wqe_cnt << qp->rq_wqe_shift) + > - (qp->sq_wqe_cnt << qp->sq_wqe_shift); > - > - return 0; > -} > - > -static int hvnd_db_map_user(struct hvnd_ucontext *uctx, unsigned long virt, > - struct ib_umem **db_umem) > -{ > - struct mlx4_ib_user_db_page *page; > - int err = 0; > - > - mutex_lock(&uctx->db_page_mutex); > - > - list_for_each_entry(page, &uctx->db_page_list, list) > - if (page->user_virt == (virt & PAGE_MASK)) > - goto found; > - > - page = kmalloc(sizeof(*page), GFP_KERNEL); > - if (!page) { > - err = -ENOMEM; > - goto out; > - } > - > - page->user_virt = (virt & PAGE_MASK); > - page->refcnt = 0; > - page->umem = ib_umem_get(&uctx->ibucontext, virt & PAGE_MASK, > - PAGE_SIZE, 0, 0); > - if (IS_ERR(page->umem)) { > - hvnd_error("ib_umem_get failure\n"); > - err = PTR_ERR(page->umem); > - kfree(page); > - goto out; > - } > - > - list_add(&page->list, &uctx->db_page_list); > - > -found: > - ++page->refcnt; > -out: > - mutex_unlock(&uctx->db_page_mutex); > - if (!err) > - *db_umem = page->umem; > - > - return err; > -} > - > -static void hvnd_db_unmap_user(struct hvnd_ucontext *uctx, u64 db_addr) > -{ > - struct mlx4_ib_user_db_page *page; > - > - mutex_lock(&uctx->db_page_mutex); > - list_for_each_entry(page, &uctx->db_page_list, list) > - if (page->user_virt == (db_addr & PAGE_MASK)) > - goto found; > - > -found: > - if (!--page->refcnt) { > - list_del(&page->list); > - ib_umem_release(page->umem); > - kfree(page); > - } > - > - mutex_unlock(&uctx->db_page_mutex); > -} > - > - > -static 
void debug_check(const char *func, int line) > -{ > - hvnd_debug("func is: %s; line is %d\n", func, line); > - > - if (in_interrupt()) { > - hvnd_error("In interrupt func is: %s; line is %d\n", > - func, line); > - return; > - } > -} > - > -static struct ib_ah *hvnd_ah_create(struct ib_pd *pd, > - struct ib_ah_attr *ah_attr, > - struct ib_udata *udata) > -{ > - debug_check(__func__, __LINE__); > - return ERR_PTR(-ENOSYS); > -} > - > -static int hvnd_ah_destroy(struct ib_ah *ah) > -{ > - debug_check(__func__, __LINE__); > - return -ENOSYS; > -} > - > -static int hvnd_multicast_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) > -{ > - debug_check(__func__, __LINE__); > - return -ENOSYS; > -} > - > -static int hvnd_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) > -{ > - debug_check(__func__, __LINE__); > - return -ENOSYS; > -} > - > -static int hvnd_process_mad(struct ib_device *ibdev, > - int mad_flags, > - u8 port_num, > - const struct ib_wc *in_wc, > - const struct ib_grh *in_grh, > - const struct ib_mad_hdr *in_mad, > - size_t in_mad_size, > - struct ib_mad_hdr *out_mad, > - size_t *out_mad_size, > - u16 *out_mad_pkey_index) > -{ > - debug_check(__func__, __LINE__); > - return -ENOSYS; > -} > - > -void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx) > -{ > - atomic_inc(&uctx->refcnt); > -} > - > -void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx) > -{ > - if (atomic_dec_and_test(&uctx->refcnt)) { > - hvnd_debug("uctx ref cnt dropped it is %d\n", > - atomic_read(&uctx->refcnt)); > - hvnd_debug("About to close adaptor\n"); > - hvnd_close_adaptor(nd_dev, uctx); > - } else > - hvnd_debug("uctx ref cnt dropped it is %d\n", > - atomic_read(&uctx->refcnt)); > -} > - > - > -static int hvnd_dealloc_ucontext(struct ib_ucontext *context) > -{ > - struct hvnd_dev *nd_dev; > - struct hvnd_ucontext *uctx; > - > - uctx = to_nd_context(context); > - nd_dev = to_nd_dev(context->device); > - > - hvnd_debug("calling %s\n", __func__); > - > - hvnd_drop_uctx_ref(nd_dev, uctx); > - > - return 0; > -} > - > -static struct ib_ucontext *hvnd_alloc_ucontext(struct ib_device *ibdev, > - struct ib_udata *udata) > -{ > - struct hvnd_dev *nd_dev = to_nd_dev(ibdev); > - struct hvnd_ucontext *uctx; > - struct mlx4_ib_alloc_ucontext_resp resp; > - struct ibv_get_context_resp *ctx_output; > - int ret; > - > - if (!nd_dev->ib_active) { > - hvnd_error("ib device is not active, try again\n"); > - return ERR_PTR(-EAGAIN); > - } > - > - uctx = get_uctx(nd_dev, current_pid()); > - if (uctx) { > - // it is already opened, just increase its reference count > - hvnd_acquire_uctx_ref(uctx); > - } else { > - > - /* > - * The Windows host expects the following to be done: > - * 1. Successfully send struct ndv_pkt_hdr_create_1 > - * 2. INIT PROVIDER > - * 3. Open Adapter > - * Before we can complete this call. > - */ > - > - uctx = kzalloc(sizeof(struct hvnd_ucontext), GFP_KERNEL); > - if (!uctx) > - return ERR_PTR(-ENOMEM); > - > - atomic_set(&uctx->refcnt, 1); > - INIT_LIST_HEAD(&uctx->db_page_list); > - mutex_init(&uctx->db_page_mutex); > - > - /* > - * Stash away the context with the calling PID. 
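A note for reviewers unfamiliar with the per-process tracking here: the
driver keys each user context by the calling PID, so a second open from
the same process only takes a reference. Roughly what the get_uctx()
lookup above plus the insert_handle() call just below boil down to
(the helper in the else branch is illustrative, not the driver's):

	/* Sketch: one context per process, keyed by PID in an IDR. */
	uctx = idr_find(&nd_dev->uctxidr, current_pid());
	if (uctx)
		hvnd_acquire_uctx_ref(uctx);	/* reuse existing context */
	else
		uctx = alloc_and_insert_uctx(nd_dev);	/* hypothetical */
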
> - */ > - ret = insert_handle(nd_dev, &nd_dev->uctxidr, uctx, > - current_pid()); > - if (ret) { > - hvnd_error("Uctx ID insertion failed; ret is %d\n", > - ret); > - goto err1; > - } > - > - hvnd_debug("Opening adaptor pid is %d\n", current_pid()); > - > - ret = hvnd_open_adaptor(nd_dev, uctx); > - if (ret) { > - hvnd_error("hvnd_open_adaptor failed ret=%d\n", ret); > - goto err1; > - } > - > - } > - > - /* > - * Copy the response out. > - */ > - ctx_output = &uctx->o_adap_pkt.mappings.ctx_output; > - resp.dev_caps = MLX4_USER_DEV_CAP_64B_CQE; > - resp.qp_tab_size = ctx_output->qp_tab_size; > - resp.bf_reg_size = ctx_output->bf_reg_size; > - resp.bf_regs_per_page = ctx_output->bf_regs_per_page; > - resp.cqe_size = ctx_output->cqe_size; > - > - ret = ib_copy_to_udata(udata, &resp, sizeof(resp)); > - if (ret) { > - hvnd_error("ib_copy_to_udata failed ret=%d\n", ret); > - goto err1; > - } > - > - return &uctx->ibucontext; > - > -err1: > - kfree(uctx); > - return ERR_PTR(ret); > -} > - > -static int hvnd_mmap(struct ib_ucontext *context, struct vm_area_struct *vma) > -{ > - struct hvnd_ucontext *uctx = to_nd_context(context); > - > - if (vma->vm_end - vma->vm_start != PAGE_SIZE) { > - hvnd_error("vma not a page size, actual size=%lu\n", > - vma->vm_end - vma->vm_start); > - return -EINVAL; > - } > - > - if (vma->vm_pgoff == 0) { > - vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); > - > - if (io_remap_pfn_range(vma, vma->vm_start, > - (uctx->uar_base >> PAGE_SHIFT), > - PAGE_SIZE, vma->vm_page_prot)) { > - hvnd_error("io_remap_pfn_range failure\n"); > - return -EAGAIN; > - } > - } else if (vma->vm_pgoff == 1 && uctx->bf_buf_size != 0) { > - vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); > - > - if (io_remap_pfn_range(vma, vma->vm_start, > - (uctx->uar_base >> PAGE_SHIFT) + 1, > - PAGE_SIZE, vma->vm_page_prot)) { > - hvnd_error("io_remap_pfn_range failure\n"); > - return -EAGAIN; > - } > - } else { > - hvnd_error("check code\n"); > - return -EINVAL; > - } > - > - return 0; > -} > - > -static int hvnd_deallocate_pd(struct ib_pd *pd) > -{ > - > - struct hvnd_ucontext *uctx; > - struct hvnd_dev *nd_dev; > - struct hvnd_ib_pd *hvnd_pd; > - struct ib_ucontext *ibuctx = pd->uobject->context; > - > - hvnd_pd = to_nd_pd(pd); > - nd_dev = to_nd_dev(pd->device); > - uctx = to_nd_context(ibuctx); > - > - hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, > - IOCTL_ND_PD_FREE); > - > - hvnd_drop_uctx_ref(nd_dev, uctx); > - return 0; > -} > - > -static struct ib_pd *hvnd_allocate_pd(struct ib_device *ibdev, > - struct ib_ucontext *context, > - struct ib_udata *udata) > -{ > - struct hvnd_ucontext *uctx; > - struct hvnd_dev *nd_dev; > - int ret; > - struct hvnd_ib_pd *hvnd_pd; > - > - if (!context) { > - hvnd_error("kernel mode context not supported\n"); > - return ERR_PTR(-EINVAL); > - } > - > - hvnd_pd = kzalloc(sizeof(struct hvnd_ib_pd), GFP_KERNEL); > - > - if (!hvnd_pd) > - return ERR_PTR(-ENOMEM); > - > - uctx = to_nd_context(context); > - nd_dev = to_nd_dev(ibdev); > - > - ret = hvnd_create_pd(uctx, nd_dev, hvnd_pd); > - if (ret) { > - hvnd_error("hvnd_create_pd failure ret=%d\n", ret); > - goto error_cr_pd; > - } > - > - if (context) { > - if (ib_copy_to_udata(udata, &hvnd_pd->pdn, sizeof(__u32))) { > - hvnd_error("ib_copy_to_udata failure\n"); > - ret = -EFAULT; > - goto error_fault; > - } > - } > - > - hvnd_acquire_uctx_ref(uctx); > - return &hvnd_pd->ibpd; > - > -error_fault: > - hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, > - IOCTL_ND_PD_FREE); > - > -error_cr_pd: > - 
kfree(hvnd_pd); > - return ERR_PTR(ret); > -} > - > -static int hvnd_query_pkey(struct ib_device *ibdev, u8 port, u16 index, > - u16 *pkey) > -{ > - debug_check(__func__, __LINE__); > - *pkey = 0; > - return 0; > -} > - > -static int hvnd_query_gid(struct ib_device *ibdev, u8 port, int index, > - union ib_gid *gid) > -{ > - int ret; > - struct hvnd_dev *nd_dev = to_nd_dev(ibdev); > - > - debug_check(__func__, __LINE__); > - > - ret = wait_for_completion_timeout(&nd_dev->addr_set, 60*HZ); > - if (!ret) > - return -ETIMEDOUT; > - > - memset(&(gid->raw[0]), 0, sizeof(gid->raw)); > - memcpy(&(gid->raw[0]), nd_dev->mac_addr, 6); > - return 0; > -} > - > -static int hvnd_query_device(struct ib_device *ibdev, > - struct ib_device_attr *props, > - struct ib_udata *udata) > -{ > - struct hvnd_dev *nd_dev = to_nd_dev(ibdev); > - struct adapter_info_v2 *adap_info; > - > - if (!nd_dev->query_pkt_set) { > - hvnd_error("query packet not received yet\n"); > - return 0; > - } > - > - adap_info = &nd_dev->query_pkt.ioctl.ad_info; > - > - memset(props, 0, sizeof(*props)); > - > - /* > - * Copy the relevant properties out. > - */ > - props->fw_ver = 0; > - props->device_cap_flags = 0; > - //props->device_cap_flags |= IB_DEVICE_BAD_PKEY_CNTR; > - //props->device_cap_flags |= IB_DEVICE_BAD_QKEY_CNTR; > - //props->device_cap_flags |= IB_DEVICE_XRC; > - > - props->vendor_id = 0x15b3; > - props->vendor_part_id = adap_info->device_id; > - > - props->max_mr_size = ~0ull; > - props->page_size_cap = PAGE_SIZE; > - props->max_qp = 16384; > - props->max_qp_wr = min(adap_info->max_recv_q_depth, > - adap_info->max_initiator_q_depth); > - > - props->max_sge = min(adap_info->max_initiator_sge, > - adap_info->max_recv_sge); > - props->max_cq = 0x1FFFF; > - props->max_cqe = adap_info->max_completion_q_depth; > - props->max_mr = 16384; > - props->max_pd = 16384; > - > - props->max_qp_rd_atom = adap_info->max_inbound_read_limit; > - props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; > - props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; > - props->max_srq = 16384; > - props->max_srq_wr = adap_info->max_recv_q_depth; > - props->max_srq_sge = adap_info->max_recv_sge; > - > - return 0; > -} > - > -static int hvnd_query_port(struct ib_device *ibdev, u8 port, > - struct ib_port_attr *props) > -{ > - memset(props, 0, sizeof(struct ib_port_attr)); > - > - props->max_mtu = IB_MTU_4096; > - props->active_mtu = IB_MTU_4096; > - > - /* > - * KYS: TBD need to base this on netdev. 
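On the TBD above: the port state could track the paired netdev's
carrier instead of the hardcoded IB_PORT_ACTIVE just below. Assuming
the driver kept a netdev reference (it currently does not), something
like:

	props->state = netif_carrier_ok(ndev) ? IB_PORT_ACTIVE
					      : IB_PORT_DOWN;
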
> - */ > - props->state = IB_PORT_ACTIVE; > - > - props->port_cap_flags = IB_PORT_CM_SUP; > - > - props->gid_tbl_len = 1; > - props->pkey_tbl_len = 1; > - props->active_width = 1; > - props->active_speed = IB_SPEED_DDR; //KYS: check > - props->max_msg_sz = -1; > - > - return 0; > -} > - > -static enum rdma_link_layer > -hvnd_get_link_layer(struct ib_device *device, u8 port) > -{ > - return IB_LINK_LAYER_ETHERNET; > -} > - > -static ssize_t hvnd_show_rev(struct device *dev, struct device_attribute *attr, > - char *buf) > -{ > - return 0; > -} > - > -static ssize_t hvnd_show_fw_ver(struct device *dev, > - struct device_attribute *attr, char *buf) > -{ > - return 0; > -} > - > -static ssize_t hvnd_show_hca(struct device *dev, struct device_attribute *attr, > - char *buf) > -{ > - return 0; > -} > - > -static ssize_t hvnd_show_board(struct device *dev, > - struct device_attribute *attr, char *buf) > -{ > - return 0; > -} > - > -static int hvnd_get_port_immutable(struct ib_device *ibdev, u8 port_num, > - struct ib_port_immutable *immutable) > -{ > - struct ib_port_attr attr; > - int err; > - > - err = hvnd_query_port(ibdev, port_num, &attr); > - if (err) > - return err; > - > - immutable->pkey_tbl_len = attr.pkey_tbl_len; > - immutable->gid_tbl_len = attr.gid_tbl_len; > - immutable->core_cap_flags = RDMA_CORE_PORT_IWARP; > - > - return 0; > -} > - > -static struct ib_qp *hvnd_ib_create_qp(struct ib_pd *pd, > - struct ib_qp_init_attr *attrs, > - struct ib_udata *udata) > -{ > - struct hvnd_ucontext *uctx; > - struct hvnd_dev *nd_dev; > - struct mlx4_ib_create_qp ucmd; > - struct hvnd_qp *qp; > - int ret = 0; > - struct hvnd_ib_pd *hvnd_pd = to_nd_pd(pd); > - struct hvnd_cq *send_cq = to_nd_cq(attrs->send_cq); > - struct hvnd_cq *recv_cq = to_nd_cq(attrs->recv_cq); > - > - uctx = get_uctx_from_pd(pd); > - nd_dev = to_nd_dev(pd->device); > - > - if (attrs->qp_type != IB_QPT_RC) { > - hvnd_error("attrs->qp_type=%d not IB_QPT_RC\n", attrs->qp_type); > - return ERR_PTR(-EINVAL); > - } > - > - qp = kzalloc(sizeof(*qp), GFP_KERNEL); > - if (!qp) { > - ret = -ENOMEM; > - goto err_done; > - } > - > - qp->uctx = uctx; > - > - if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { > - hvnd_error("ib_copy_from_udata failed\n"); > - ret = -EFAULT; > - goto err_ucpy; > - } > - > - qp->qp_buf = (void *)ucmd.buf_addr; > - qp->db_addr = (void *)ucmd.db_addr; > - qp->log_sq_bb_count = ucmd.log_sq_bb_count; > - qp->log_sq_stride = ucmd.log_sq_stride; > - qp->sq_no_prefetch = ucmd.sq_no_prefetch; > - qp->port = attrs->port_num; > - > - init_waitqueue_head(&qp->wait); > - atomic_set(&qp->refcnt, 1); > - > - qp->recv_cq = recv_cq; > - qp->send_cq = send_cq; > - qp->nd_dev = nd_dev; > - > - qp->receive_cq_handle = recv_cq->cq_handle; > - qp->initiator_cq_handle = send_cq->cq_handle; > - qp->pd_handle = hvnd_pd->handle; > - qp->cq_notify = false; > - > - qp->ibqp.qp_num = attrs->qp_type == IB_QPT_SMI ? 
0 : 1; > - > - qp->max_inline_data = attrs->cap.max_inline_data; > - > - qp->initiator_q_depth = attrs->cap.max_send_wr; > - qp->initiator_request_sge = attrs->cap.max_send_sge; > - > - > - qp->receive_q_depth = attrs->cap.max_recv_wr; > - qp->receive_request_sge = attrs->cap.max_recv_sge; > - > - set_rq_size(nd_dev, &attrs->cap, qp); > - > - set_user_sq_size(nd_dev, qp, &ucmd); > - > - qp->umem = ib_umem_get(&uctx->ibucontext, ucmd.buf_addr, > - qp->buf_size, 0, 0); > - if (IS_ERR(qp->umem)) { > - ret = PTR_ERR(qp->umem); > - hvnd_error("ib_umem_get failed ret=%d\n", ret); > - goto err_ucpy; > - } > - > - ret = hvnd_db_map_user(uctx, ucmd.db_addr, &qp->db_umem); > - > - if (ret) { > - hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); > - goto err_db_map; > - } > - > - ret = hvnd_create_qp(nd_dev, uctx, qp); > - > - if (ret) { > - hvnd_error("hvnd_create_qp failed ret=%d\n", ret); > - goto err_qp; > - } > - > - hvnd_acquire_uctx_ref(uctx); > - > - qp->ibqp.qp_num = qp->qpn; > - qp->ibqp.qp_type = IB_QPT_RC; > - > - > - return &qp->ibqp; > - > -err_qp: > - hvnd_db_unmap_user(uctx, ucmd.db_addr); > - > -err_db_map: > - ib_umem_release(qp->umem); > - > -err_ucpy: > - kfree(qp); > -err_done: > - return ERR_PTR(ret); > -} > - > -static int hvnd_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, > - int attr_mask, struct ib_udata *udata) > -{ > - struct hvnd_qp *qp = to_nd_qp(ibqp); > - struct hvnd_dev *nd_dev = to_nd_dev(ibqp->device); > - enum ib_qp_state cur_state, new_state; > - int ret = 0; > - > - if (attr != NULL) { > - cur_state = (attr_mask & IB_QP_CUR_STATE) > - ? attr->cur_qp_state : qp->qp_state; > - new_state = (attr_mask & IB_QP_STATE) > - ? attr->qp_state : cur_state; > - > - hvnd_debug("qp->qp_state is %d new state is %d\n", > - qp->qp_state, new_state); > - hvnd_debug("current qp state is %d\n", cur_state); > - if (attr_mask & IB_QP_STATE) { > - /* Ensure the state is valid */ > - if (attr->qp_state < 0 || attr->qp_state > IB_QPS_ERR) { > - hvnd_error("incorrect qp state attr->qp_state=%d\n", > - attr->qp_state); > - return -EINVAL; > - } > - > - if (qp->qp_state != new_state) { > - qp->qp_state = new_state; > - /* > - * The only state transition supported is the > - * transition to error state. 
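Most providers sanity-check the transition with ib_modify_qp_is_ok()
rather than the open-coded range check above; a sketch against the
current signature (iWARP link layer assumed):

	if ((attr_mask & IB_QP_STATE) &&
	    !ib_modify_qp_is_ok(cur_state, new_state, ibqp->qp_type,
				attr_mask, IB_LINK_LAYER_ETHERNET))
		return -EINVAL;
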
> - */ > - switch (new_state) { > - case IB_QPS_ERR: > - case IB_QPS_SQD: > - ret = hvnd_flush_qp(nd_dev, qp->uctx, > - qp); > - > - if (ret) > - hvnd_error("hvnd_flush_qp failed ret=%d\n", ret); > - > - // immediately notify the upper layer > - // on disconnection > - if (!ret && qp->connector) > - hvnd_process_notify_disconnect( > - qp->connector, > - STATUS_SUCCESS); > - > - return ret; > - > - default: > - break; > - } > - } > - } > - } > - return 0; > -} > - > - > -static int hvnd_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, > - int attr_mask, struct ib_qp_init_attr *init_attr) > -{ > - struct hvnd_qp *qp = to_nd_qp(ibqp); > - > - memset(attr, 0, sizeof(*attr)); > - memset(init_attr, 0, sizeof(*init_attr)); > - > - attr->qp_state = qp->qp_state; > - > - init_attr->cap.max_send_wr = qp->max_send_wr; > - init_attr->cap.max_recv_wr = qp->max_recv_wr; > - > - init_attr->cap.max_send_sge = qp->max_send_sge; > - init_attr->cap.max_recv_sge = qp->max_recv_sge; > - init_attr->cap.max_inline_data = qp->max_inline_data; > - > - init_attr->sq_sig_type = IB_SIGNAL_ALL_WR; > - > - return 0; > -} > - > -static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status); > -static int hvnd_destroy_qp(struct ib_qp *ib_qp) > -{ > - int ret; > - struct hvnd_qp *qp = to_nd_qp(ib_qp); > - struct hvnd_dev *nd_dev = to_nd_dev(ib_qp->device); > - u64 jiffies; > - > - if (!qp->connector) { > - hvnd_warn("error: connector is NULL; skip destroying connector\n"); > - goto free_qp; > - } > - > - /* > - * should we flush the qp first on ctrl-C?, > - * no need to disconnect on abrupt shutdown? > - */ > - if (qp->qp_state != IB_QPS_ERR && qp->qp_state != IB_QPS_SQD) { > - hvnd_warn("qp_state=%d, doing abrupt disconnect\n", > - qp->qp_state); > - hvnd_flush_qp(nd_dev, qp->uctx, qp); > - > - ep_stop(qp->connector); > - > - // now no pending activity is possible on the connector > - > - switch (qp->connector->cm_state) { > - > - case hvnd_cm_idle: > - case hvnd_cm_connect_reply_refused: > - case hvnd_cm_connect_request_sent: > - case hvnd_cm_close_sent: > - hvnd_warn("cm_state = %d not doing anything\n", > - qp->connector->cm_state); > - break; > - > - case hvnd_cm_connect_received: > - hvnd_warn("cm_state = %d refusing pending connection request\n", > - qp->connector->cm_state); > - hvnd_refuse_connection(qp->connector, -ECONNREFUSED); > - break; > - > - case hvnd_cm_connect_reply_sent: > - case hvnd_cm_established_sent: > - case hvnd_cm_accept_sent: > - hvnd_warn("cm_state = %d notifying disconnect on existing connection\n", > - qp->connector->cm_state); > - hvnd_process_notify_disconnect(qp->connector, > - STATUS_CANCELLED); > - break; > - > - default: > - hvnd_error("unknown cm_state = %d\n", > - qp->connector->cm_state); > - > - } > - goto free_connector; > - } else { > - hvnd_debug("qp_state=%d, doing normal disconnect\n", > - qp->qp_state); > - } > - > - if (!ep_add_work_pending(qp->connector)) > - goto free_connector; > - > - init_completion(&qp->connector->disconnect_event); > - > - /* > - * First issue a disconnect on the connector. > - */ > - > - hvnd_debug("calling hvnd_connector_disconnect\n"); > - ret = hvnd_connector_disconnect(nd_dev, qp->uctx, > - qp->connector->ep_handle, > - qp->connector); > - if (ret) { > - ep_del_work_pending(qp->connector); > - hvnd_error("disconnect: retval is %d\n", ret); > - ep_stop(qp->connector); > - goto free_connector; > - } > - /* > - * Now wait for the disconnect. 
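The wait_for_completion() just below is unbounded; if the host never
delivers the disconnect completion, QP destroy hangs forever. A bounded
variant (60 s picked arbitrarily here) would at least let teardown make
progress:

	if (!wait_for_completion_timeout(&qp->connector->disconnect_event,
					 60 * HZ))
		hvnd_warn("timed out waiting for disconnect completion\n");
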
> - */ > - jiffies = get_jiffies_64(); > - wait_for_completion(&qp->connector->disconnect_event); > - hvnd_debug("Completed disconnect connector=%p jiffies=%llu\n", > - qp->connector, get_jiffies_64() - jiffies); > - > - /* > - * Now free up the connector and drop the reference on uctx. > - */ > - > - ep_stop(qp->connector); > - > -free_connector: > - hvnd_debug("destroying connector handle: %p\n", > - (void *) qp->connector->ep_handle); > - hvnd_free_handle(nd_dev, qp->uctx, > - qp->connector->ep_handle, > - IOCTL_ND_CONNECTOR_FREE); > - > - hvnd_drop_uctx_ref(nd_dev, qp->uctx); > - hvnd_destroy_ep(qp->connector); > - qp->connector = NULL; > -free_qp: > - atomic_dec(&qp->refcnt); > - hvnd_debug("Waiting for the ref cnt to go to 0\n"); > - > - wait_event(qp->wait, !atomic_read(&qp->refcnt)); > - > - hvnd_debug("About to destroy qp\n"); > - hvnd_db_unmap_user(qp->uctx, (u64)qp->db_addr); > - ib_umem_release(qp->umem); > - > - hvnd_debug("About to free qp\n"); > - ret = hvnd_free_qp(nd_dev, qp->uctx, qp); > - > - if (ret == 0) { > - hvnd_drop_uctx_ref(nd_dev, qp->uctx); > - kfree(qp); > - } else { > - hvnd_error("free qp failed: ret is %d\n", ret); > - } > - > - return ret; > -} > - > -static struct ib_cq *hvnd_ib_create_cq(struct ib_device *ibdev, > - const struct ib_cq_init_attr *attr, > - struct ib_ucontext *ib_context, > - struct ib_udata *udata) > -{ > - struct hvnd_ucontext *uctx; > - struct hvnd_dev *nd_dev; > - struct mlx4_ib_create_cq ucmd; > - struct hvnd_cq *cq; > - int ret = 0; > - int entries = attr->cqe; > - > - uctx = to_nd_context(ib_context); > - nd_dev = to_nd_dev(ibdev); > - > - if (entries < 1 || entries > uctx->max_cqe) { > - hvnd_error("incorrct entries=%d\n", entries); > - ret = -EINVAL; > - goto err_done; > - } > - > - cq = kzalloc(sizeof(*cq), GFP_KERNEL); > - if (!cq) { > - ret = -ENOMEM; > - goto err_done; > - } > - > - entries = roundup_pow_of_two(entries + 1); > - cq->ibcq.cqe = entries - 1; > - cq->entries = entries; > - cq->uctx = uctx; > - > - if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { > - hvnd_error("ib_copy_from_udata failed\n"); > - ret = -EFAULT; > - goto err_ucpy; > - } > - > - cq->cq_buf = (void *)ucmd.buf_addr; > - cq->db_addr = (void *)ucmd.db_addr; > - cq->arm_sn = 0; > - > - /* > - * Initialize the IRP state. Need to have a separate irp state > - * for CQ; for now share it with Listener/connector. 
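One real bug in the CQ creation path just below: on ib_umem_get()
failure the code stores ret = IS_ERR(cq->umem), i.e. 1, instead of the
negative errno. It should read:

	if (IS_ERR(cq->umem)) {
		ret = PTR_ERR(cq->umem);	/* not IS_ERR() */
		hvnd_error("ib_umem_get failed ret=%d\n", ret);
		goto err_ucpy;
	}

Also a nit in the entries check above: s/incorrct/incorrect/.
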
> - */ > - ret = hvnd_init_ep(&cq->ep_object, NULL, ND_CQ, nd_dev, uctx); > - > - if (ret) { > - hvnd_error("hvnd_init_ep failed ret=%d\n", ret); > - goto err_ucpy; > - } > - > - cq->ep_object.cq = cq; > - cq->monitor = true; > - > - cq->umem = ib_umem_get(ib_context, ucmd.buf_addr, > - (entries * uctx->cqe_size), > - IB_ACCESS_LOCAL_WRITE, 1); > - if (IS_ERR(cq->umem)) { > - ret = IS_ERR(cq->umem); > - hvnd_error("ib_umem_get failed ret=%d\n", ret); > - goto err_ucpy; > - } > - > - ret = hvnd_db_map_user(uctx, ucmd.db_addr, &cq->db_umem); > - > - if (ret) { > - hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); > - goto err_db_map; > - } > - > - ret = hvnd_create_cq(nd_dev, uctx, cq); > - > - if (ret) { > - hvnd_error("hvnd_create_cq failed ret=%d\n", ret); > - goto err_cq; > - } > - > - cq->ep_object.ep_handle = cq->cq_handle; > - > - if (ib_copy_to_udata(udata, &cq->cqn, sizeof(__u32))) { > - hvnd_error("ib_copy_to_udata failed\n"); > - ret = -EFAULT; > - goto err_ucpy_out; > - } > - > - if (!disable_cq_notify) { > - > - if (!ep_add_work_pending(&cq->ep_object)) > - goto err_ucpy_out; > - > - ret = hvnd_notify_cq(nd_dev, cq, ND_CQ_NOTIFY_ANY, > - (u64)&cq->ep_object); > - > - if (ret) { > - ep_del_work_pending(&cq->ep_object); > - hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); > - goto err_ucpy_out; > - } > - } > - > - hvnd_acquire_uctx_ref(uctx); > - > - return &cq->ibcq; > - > -err_ucpy_out: > - hvnd_destroy_cq(nd_dev, cq); > - > -err_cq: > - hvnd_db_unmap_user(uctx, ucmd.db_addr); > - > -err_db_map: > - ib_umem_release(cq->umem); > - > -err_ucpy: > - kfree(cq); > -err_done: > - return ERR_PTR(ret); > -} > - > -static struct ib_qp *hvnd_get_qp(struct ib_device *dev, int qpn) > -{ > - struct hvnd_dev *nd_dev; > - struct hvnd_qp *qp = NULL; > - > - nd_dev = to_nd_dev(dev); > - qp = get_qpp(nd_dev, qpn); > - return qp ? &qp->ibqp : NULL; > -} > - > -static int hvnd_ib_destroy_cq(struct ib_cq *ib_cq) > -{ > - struct hvnd_ucontext *uctx; > - struct hvnd_dev *nd_dev; > - struct hvnd_cq *cq; > - > - cq = to_nd_cq(ib_cq); > - uctx = cq->uctx; > - nd_dev = to_nd_dev(uctx->ibucontext.device); > - > - cq->monitor = false; > - > -// hvnd_cancel_io(&cq->ep_object); > - ep_stop(&cq->ep_object); > - > - hvnd_deinit_ep(&cq->ep_object); > - > - hvnd_db_unmap_user(uctx, (u64)cq->db_addr); > - ib_umem_release(cq->umem); > - > - hvnd_destroy_cq(nd_dev, cq); > - > - hvnd_drop_uctx_ref(nd_dev, uctx); > - kfree(cq); > - > - return 0; > -} > - > -static int hvnd_resize_cq(struct ib_cq *cq, int cqe, struct ib_udata *udata) > -{ > - /* > - * NDDirect does not support resizing CQ. 
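Nit on the stubs here and below: checkpatch flags new ENOSYS uses (it
is reserved for missing syscalls); EOPNOTSUPP is the conventional
return for a verb the device does not support, e.g.:

	static int hvnd_resize_cq(struct ib_cq *cq, int cqe,
				  struct ib_udata *udata)
	{
		/* the ND host interface has no CQ resize */
		return -EOPNOTSUPP;
	}
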
> - */ > - hvnd_info("check code\n"); > - return -ENOSYS; > -} > - > -static int hvnd_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc) > -{ > - hvnd_info("check code\n"); > - return 0; > -} > - > -static struct ib_mr *hvnd_get_dma_mr(struct ib_pd *pd, int acc) > -{ > - hvnd_info("check code\n"); > - return NULL; > -} > - > -static void debug_dump_umem(struct ib_umem *umem) > -{ > -#ifdef HVND_MEM_DEBUG > - struct ib_umem_chunk *chunk; > - struct scatterlist *sg; > - int len, j, entry; > - int shift = ffs(umem->page_size) - 1; > - > - hvnd_debug("umem=%p\n", umem); > - hvnd_debug("context=%p length=%lu offset=%d page_size=%d writable=%d hugetlb=%d\n", > - umem->context, > - umem->length, > - umem->offset, > - umem->page_size, > - umem->writable, > - umem->hugetlb); > - > - list_for_each_entry(chunk, &umem->chunk_list, list) { > - hvnd_debug("chunk->nmap=%d\n", chunk->nmap); > - for (j = 0; j < chunk->nmap; ++j) { > - sg = &chunk->page_list[j]; > - hvnd_debug("sg_dma_len=%d sg_dma_address=%llx\n", > - sg_dma_len(sg), sg_dma_address(sg)); > - hvnd_debug("page_link=%lx offset=%u length=%u\n", > - sg->page_link, sg->offset, sg->length); > - len = sg_dma_len(&chunk->page_list[j]) >> shift; > - for_each_sg(&chunk->page_list[j], sg, len, entry) { > - hvnd_debug("PFN=%lu\n", > - page_to_pfn(sg_page(sg))); > - } > - } > - } > -#endif > -} > - > - > -static struct ib_mr *hvnd_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, > - u64 virt, int acc, struct ib_udata *udata) > -{ > - int err = 0; > - struct hvnd_ib_pd *hvndpd = to_nd_pd(pd); > - struct hvnd_mr *mr; > - > - mr = kmalloc(sizeof(*mr), GFP_KERNEL); > - if (!mr) > - return ERR_PTR(-ENOMEM); > - mr->pd = hvndpd; > - > - mr->umem = ib_umem_get(pd->uobject->context, start, length, acc, 0); > - if (IS_ERR(mr->umem)) { > - err = PTR_ERR(mr->umem); > - hvnd_error("ib_umem_get failed ret=%d\n", err); > - kfree(mr); > - return ERR_PTR(err); > - } > - > - debug_dump_umem(mr->umem); > - > - mr->start = start; > - mr->length = length; > - mr->virt = virt; > - mr->acc = acc; > - > - hvnd_debug("start=%llx length=%llx virt=%llx acc=%d\n", > - start, length, virt, acc); > - > - /* > - * First create a memory region. > - */ > - err = hvnd_cr_mr(to_nd_dev(pd->device), > - to_nd_context(pd->uobject->context), hvndpd->handle, > - &mr->mr_handle); > - if (err) { > - hvnd_error("cr_mr failed; ret is %d\n", err); > - goto err; > - } > - > - err = hvnd_mr_register(to_nd_dev(pd->device), > - to_nd_context(pd->uobject->context), mr); > - > - if (err) > - goto err0; > - > - hvnd_acquire_uctx_ref(to_nd_context(pd->uobject->context)); > - > - return &mr->ibmr; > - > -err0: > - hvnd_free_mr(to_nd_dev(pd->device), > - to_nd_context(pd->uobject->context), mr->mr_handle); > -err: > - ib_umem_release(mr->umem); > - kfree(mr); > - return ERR_PTR(err); > -} > - > - > - > -static int hvnd_dereg_mr(struct ib_mr *ib_mr) > -{ > - int ret; > - struct hvnd_mr *mr = to_nd_mr(ib_mr); > - struct hvnd_ucontext *uctx = to_nd_context(ib_mr->pd->uobject->context); > - struct hvnd_dev *nd_dev = to_nd_dev(ib_mr->device); > - > - > - hvnd_debug("dereg_mr entering\n"); > - > - ret = hvnd_deregister_mr(nd_dev, uctx, mr->mr_handle); > - > - if (ret) { > - hvnd_error("hvnd_deregister_mr() failed: %x\n", ret); > - return ret; > - } > - /* > - * Now free up the memory region. 
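For context, the deregister/free pair here is the tail of the
user-visible MR lifecycle; from userspace the whole path is driven by
the usual libibverbs calls (sketch, not driver code):

	struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
				       IBV_ACCESS_LOCAL_WRITE);
	/* ... post work requests referencing mr->lkey ... */
	ibv_dereg_mr(mr);	/* reaches hvnd_dereg_mr() above */
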
> - */ > - > - ret = hvnd_free_mr(nd_dev, uctx, mr->mr_handle); > - if (ret) { > - hvnd_error("hvnd_free_mr() failed: %x\n", ret); > - return ret; > - } > - > - ib_umem_release(mr->umem); > - > - hvnd_drop_uctx_ref(nd_dev, uctx); > - kfree(mr); > - > - hvnd_debug("dereg_mr done\n"); > - return 0; > -} > - > -static struct ib_mw *hvnd_alloc_mw(struct ib_pd *pd, enum ib_mw_type type, > - struct ib_udata *udata) > -{ > - hvnd_info("check code\n"); > - return NULL; > -} > - > -static int hvnd_dealloc_mw(struct ib_mw *mw) > -{ > - debug_check(__func__, __LINE__); > - return 0; > -} > - > -static int hvnd_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags) > -{ > - struct hvnd_ucontext *uctx; > - struct hvnd_dev *nd_dev; > - struct hvnd_cq *cq; > - > - cq = to_nd_cq(ibcq); > - uctx = cq->uctx; > - nd_dev = to_nd_dev(uctx->ibucontext.device); > - > - > - debug_check(__func__, __LINE__); > - > - return 0; > -} > - > -static int hvnd_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, > - struct ib_send_wr **bad_wr) > -{ > - debug_check(__func__, __LINE__); > - return 0; > -} > - > -int hvnd_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr, > - struct ib_recv_wr **bad_wr) > -{ > - debug_check(__func__, __LINE__); > - return 0; > -} > - > -static int hvnd_resolve_addr(struct sockaddr_in *laddr, > - struct sockaddr_in *raddr, > - struct if_physical_addr *phys_addrstruct) > -{ > - int ret; > - > - phys_addrstruct->length = ETH_ALEN; > - ret = hvnd_get_neigh_mac_addr((struct sockaddr *)laddr, > - (struct sockaddr *)raddr, > - phys_addrstruct->addr); > - > - hvnd_debug("Dest MAC is %pM\n", phys_addrstruct->addr); > - return ret; > -} > - > -static int hvnd_connect(struct iw_cm_id *cm_id, > - struct iw_cm_conn_param *conn_param) > -{ > - int ret = 0; > - struct hvnd_dev *nd_dev; > - struct hvnd_ep_obj *ep_object; > - struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_id->remote_addr; > - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; > - struct hvnd_qp *qp; > - struct if_physical_addr phys_addrstruct; > - union nd_sockaddr_inet dest_addr; > - u64 connector_handle; > - union nd_sockaddr_inet addr; > - char addr_buf[50]; > - > - if (cm_id->remote_addr.ss_family != AF_INET) { > - hvnd_error("cm_id->remote_addr.ss_family=%d not AF_INET\n", > - cm_id->remote_addr.ss_family); > - return -ENOSYS; > - } > - > - qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); > - > - if (!qp) { > - hvnd_error("failed to find qp conn_param->qpn=%d\n", > - conn_param->qpn); > - return -EINVAL; > - } > - > - cm_id->provider_data = qp; > - cm_id->add_ref(cm_id); > - qp->cm_id = cm_id; > - > - /* > - * Set the read/write limits. > - * Can we change the limits on a created QP? Luke? > - */ > - nd_dev = to_nd_dev(cm_id->device); > - ep_object = hvnd_setup_ep(cm_id, ND_CONNECTOR, nd_dev, qp->uctx); > - hvnd_debug("active connection: local irp is %d\n", > - ep_object->local_irp); > - if (!ep_object) { > - hvnd_error("hvnd_setup_ep failure\n"); > - ret = -ENOMEM; > - goto err_limit; > - } > - > - ret = hvnd_cr_connector(nd_dev, qp->uctx, > - &connector_handle); > - > - if (ret) { > - hvnd_error("hvnd_cr_connector failure ret=%d\n", ret); > - goto err_cr_connector; > - } > - > - hvnd_acquire_uctx_ref(qp->uctx); > - ep_object->ep_handle = connector_handle; > - ep_object->incoming = false; > - qp->connector = ep_object; > - > - /* > - * Bind the local address to the connector. 
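In hvnd_connect() above: hvnd_setup_ep() can return NULL, but the debug
print dereferences ep_object->local_irp before the !ep_object check.
The check has to come first:

	ep_object = hvnd_setup_ep(cm_id, ND_CONNECTOR, nd_dev, qp->uctx);
	if (!ep_object) {
		hvnd_error("hvnd_setup_ep failure\n");
		ret = -ENOMEM;
		goto err_limit;
	}
	hvnd_debug("active connection: local irp is %d\n",
		   ep_object->local_irp);
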
> - */ > - hvnd_debug("Connect local address is %s\n", > - debug_inet_ntoa(laddr->sin_addr, addr_buf)); > - > - memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); > - hvnd_debug("CONNECT AF %d port %d addr %s\n", > - addr.ipv4.sin_family, addr.ipv4.sin_port, > - debug_inet_ntoa(addr.ipv4.sin_addr, addr_buf)); > - > - ret = hvnd_bind_connector(nd_dev, qp->uctx, > - connector_handle, > - &addr); > - > - if (ret) { > - hvnd_error("hvnd_bind_connector failed ret=%d\n", ret); > - goto err_bind_connector; > - } > - > - ret = hvnd_resolve_addr(laddr, raddr, &phys_addrstruct); > - if (ret) { > - hvnd_error("hvnd_resolve_addr failed ret=%d\n", ret); > - goto err_bind_connector; > - } > - > - memcpy(&dest_addr.ipv4, raddr, sizeof(struct sockaddr_in)); > - > - > - > - /* > - * Now attempt to connect. > - */ > - > - hvnd_debug("About to initiate connection\n"); > - > - if (!ep_add_work_pending(ep_object)) > - goto err_bind_connector; > - > - ep_object->cm_state = hvnd_cm_connect_received; > - ret = hvnd_connector_connect(nd_dev, qp->uctx, > - ep_object->ep_handle, > - conn_param->ird, conn_param->ord, > - conn_param->private_data_len, > - (u8 *)conn_param->private_data, > - qp->qp_handle, > - &phys_addrstruct, &dest_addr, > - ep_object); > - > - if (ret == 0) > - return 0; > - > - ep_object->cm_state = hvnd_cm_idle; > - ep_del_work_pending(ep_object); > - hvnd_error("hvnd_connector_connect failed ret=%d\n", ret); > - > -err_bind_connector: > - qp->connector = NULL; > - hvnd_free_connector(nd_dev, qp->uctx, > - connector_handle); > - hvnd_drop_uctx_ref(nd_dev, qp->uctx); > - > -err_cr_connector: > - kfree(ep_object); > - > -err_limit: > - cm_id->provider_data = NULL; > - qp->cm_id = NULL; > - cm_id->rem_ref(cm_id); > - return ret; > -} > - > -static int hvnd_accept_cr(struct iw_cm_id *cm_id, > - struct iw_cm_conn_param *conn_param) > -{ > - int ret = 0; > - struct hvnd_dev *nd_dev; > - struct hvnd_qp *qp; > - struct hvnd_ep_obj *connector; > - enum ibv_qp_state new_qp_state; > - > - hvnd_debug("Accepting connection - PASSIVE\n"); > - nd_dev = to_nd_dev(cm_id->device); > - qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); > - > - if (!qp) { > - hvnd_error("get_qpp failed conn_param->qpn=%d\n", > - conn_param->qpn); > - return -EINVAL; > - } > - > - > - connector = (struct hvnd_ep_obj *)cm_id->provider_data; > - qp->connector = connector; > - > - if (connector == NULL) { > - hvnd_error("NULL connector!\n"); > - return -EINVAL; > - } > - hvnd_debug("connector's cm_id is %p caller cm_id=%p\n", > - connector->cm_id, cm_id); > - > - connector->cq = qp->recv_cq; > - > - > - /* > - * Setup state for the accepted connection. 
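Same ordering problem in hvnd_accept_cr(): conn_param->qpn is used at
the top of the function, so the conn_param == NULL test further down
can never fire in time. Hoist it to the entry:

	if (!conn_param)
		return -EINVAL;
	qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn);
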
> - */ > - cm_id->add_ref(cm_id); > - connector->cm_id = cm_id; > - if (conn_param == NULL) { > - hvnd_error("NULL conn_param!\n"); > - return -EINVAL; > - } > - > - connector->ord = conn_param->ord; > - connector->ird = conn_param->ird; > - > - if (!ep_add_work_pending(connector)) > - goto error; > - > - init_completion(&connector->connector_accept_event); > - > - ret = hvnd_connector_accept(nd_dev, qp->uctx, connector->ep_handle, > - qp->qp_handle, conn_param->ird, > - conn_param->ord, > - conn_param->private_data_len, > - conn_param->private_data, > - &new_qp_state, connector); > - > - if (ret) { > - ep_del_work_pending(connector); > - hvnd_error("connector accept failed\n"); > - goto error; > - } > - > - wait_for_completion(&connector->connector_accept_event); > - ret = connector->connector_accept_status; > - > - if (ret) { > - hvnd_error("connector_accept failed status=%x\n", ret); > - ret = -EIO; > - goto error; > - } > - > - hvnd_debug("Passive Connection Accepted; new qp state is %d\n", > - new_qp_state); > - connector->cm_state = hvnd_cm_accept_sent; > - return 0; > - > -error: > - ep_stop(connector); > - connector->cm_id = NULL; > - connector->cm_state = hvnd_cm_idle; > - > - qp->connector = NULL; > - cm_id->rem_ref(cm_id); > - > - return ret; > -} > - > -static int hvnd_reject_cr(struct iw_cm_id *cm_id, const void *pdata, > - u8 pdata_len) > -{ > - debug_check(__func__, __LINE__); > - return 0; > -} > - > -void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status) > -{ > - struct iw_cm_event cm_event; > - > - switch (status) { > - case STATUS_SUCCESS: > - case STATUS_CANCELLED: > - break; > - > - default: > - hvnd_warn("disconnect complete failed: status:%d\n", status); > - } > - > - > - hvnd_debug("active disconnect processed\n"); > - memset(&cm_event, 0, sizeof(cm_event)); > - > - complete(&ep_object->disconnect_event); > -} > - > - > -void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status) > -{ > - struct iw_cm_event cm_event; > - > - // make sure we only disconnect once > - if (atomic_xchg(&ep_object->disconnect_notified, 1)) > - return; > - > - /* > - * Turn off CQ monitoring. > - */ > - if (ep_object->cq) > - ep_object->cq->monitor = false; > - > - switch (ep_object->cm_state) { > - case hvnd_cm_connect_reply_sent: > - case hvnd_cm_established_sent: > - case hvnd_cm_accept_sent: > - break; > - > - default: > - hvnd_error("unexpected cm_state=%d\n", ep_object->cm_state); > - return; > - } > - > - switch (status) { > - case STATUS_SUCCESS: > - case STATUS_CANCELLED: > - case STATUS_DISCONNECTED: > - break; > - > - default: > - hvnd_warn("notify disconnect complete failed: status:%d\n", > - status); > - } > - > - hvnd_debug("passive disconnect notified\n"); > - memset(&cm_event, 0, sizeof(cm_event)); > - > - /* > - * Other end disconnected. > - * Connection has been disconnected; > - * notify the cm layer. 
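The atomic_xchg() above is the standard once-only guard (the first
caller reads 0 and proceeds, later callers read 1 and bail). The bitops
equivalent, with a hypothetical flags word, would be:

	if (test_and_set_bit(HVND_DISC_NOTIFIED, &ep_object->flags))
		return;
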
> - */ > - cm_event.status = -ECONNRESET; > - cm_event.event = IW_CM_EVENT_CLOSE; > - > - if ((ep_object->cm_id) && > - (ep_object->cm_id->event_handler)) { > - > - ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); > - > - ep_object->cm_id->rem_ref(ep_object->cm_id); > - ep_object->cm_state = hvnd_cm_close_sent; > - } > -} > - > -void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status) > -{ > - struct iw_cm_event cm_event; > - int ret; > - > - /* > - * This is the problem area the return status may be: > - * 1: 0xc00000b5 (3221225653) - {Device Timeout} The specified I/O > - * operation on %hs was not completed before the time-out period > - * expired > - * 2: NTSTATUS 0xc0000241 (3221226049) - The transport connection was > - * aborted by the local system. > - * If we do nothing here, iwcm will wait for IW_CM_EVENT_ESTABLISHED > - * forever, and unable to clean shutdown need to fail the call earlier > - * on accept. > - */ > - > - ep_object->connector_accept_status = status; > - > - if (status) { > - hvnd_error("Connector accept failed; status is %x\n", status); > - complete(&ep_object->connector_accept_event); > - return; > - } > - > - memset(&cm_event, 0, sizeof(cm_event)); > - cm_event.event = IW_CM_EVENT_ESTABLISHED; > - cm_event.ird = ep_object->ird; > - cm_event.ord = ep_object->ord; > - cm_event.provider_data = (void *)ep_object; > - > - /* > - * We have successfully passively accepted the > - * incoming connection. > - */ > - > - hvnd_debug("Passive connection accepted!!\n"); > - if ((ep_object->cm_id) && > - (ep_object->cm_id->event_handler)) { > - ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); > - ep_object->cm_state = hvnd_cm_established_sent; > - } > - > - complete(&ep_object->connector_accept_event); > - > - /* > - * Request notification if the other end > - * were to disconnect. 
> - */ > - if (!ep_add_work_pending(ep_object)) > - return; > - > - ret = hvnd_connector_notify_disconnect(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - ep_object); > - > - if (ret) { > - ep_del_work_pending(ep_object); > - hvnd_error("Connector notify disconnect failed; ret: %d\n", > - ret); > - } > -} > - > - > -void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep_object, > - int status) > -{ > - > - struct ib_cq *ibcq; > - struct hvnd_cq *cq; > - > - cq = ep_object->cq; > - ibcq = &ep_object->cq->ibcq; > - > - if (!cq->monitor) > - return; > - > - // call the previous CQ complete > - if ((status == STATUS_PENDING) && cq->upcall_pending && > - ibcq->comp_handler) { > - ibcq->comp_handler(ibcq, ibcq->cq_context); > - cq->upcall_pending = false; > - hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); > - } > - > - if ((status != STATUS_PENDING) && ibcq->comp_handler && > - ibcq->cq_context) { > - ibcq->comp_handler(ibcq, ibcq->cq_context); > - hvnd_error("CQ comp_handler called status=%x\n", status); > - } > -} > - > -void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep_object, > - int status) > -{ > - struct ib_cq *ibcq; > - struct hvnd_cq *cq; > - int ret; > - > - cq = ep_object->cq; > - ibcq = &ep_object->cq->ibcq; > - > - // call hte previous CQ complete > - if (cq->upcall_pending && ibcq->comp_handler) { > - ibcq->comp_handler(ibcq, ibcq->cq_context); > - cq->upcall_pending = false; > - hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); > - } > - > - cq->upcall_pending = true; > - if (!ep_add_work_pending(ep_object)) > - return; > - > - ret = hvnd_notify_cq(ep_object->nd_dev, > - ep_object->cq, > - ND_CQ_NOTIFY_ANY, > - (u64)ep_object); > - > - if (ret) { > - ep_del_work_pending(ep_object); > -// hvnd_manage_io_state(ep_object, true); > - hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); > - } > - > - if ((status != 0) && (status != STATUS_CANCELLED)) { > - if (ibcq->event_handler) { > - struct ib_event event; > - > - event.device = ibcq->device; > - event.event = IB_EVENT_CQ_ERR; > - event.element.cq = ibcq; > - ibcq->event_handler(&event, ibcq->cq_context); > - > - hvnd_warn("CQ event_handler called status=%x\n", > - status); > - } > - } > -} > - > -int init_cm_event(struct hvnd_ep_obj *ep_object, struct iw_cm_event *cm_event, > - int event) > -{ > - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event->local_addr; > - struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event->remote_addr; > - struct nd_read_limits rd_limits; > - union nd_sockaddr_inet local_addr; > - union nd_sockaddr_inet remote_addr; > - int ret; > - > - /* > - * Now get the local address. > - */ > - ret = hvnd_connector_get_local_addr(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &local_addr); > - > - if (ret) { > - hvnd_error("Connector get addr failed; ret: %d\n", ret); > - return ret; > - } > - /* > - * Now get the remote address. > - */ > - ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &remote_addr); > - > - if (ret) { > - hvnd_error("Connector get peer addr failed; ret: %d\n", ret); > - return ret; > - } > - > - /* > - * Get other connection parameters. > - */ > - > - ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &rd_limits); > - > - if (ret) { > - hvnd_error("Connector rd limits failed; ret: %d\n", ret); > - return ret; > - } > - > - /* > - * XXXKYS: Luke: What about the length of the priv data? 
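On the XXXKYS question above: MPA private data is variable-length (at
most 512 bytes on the wire), so always advertising MAX_PRIVATE_DATA_LEN
below means the consumer may read trailing junk from priv_data. If the
host interface can report the actual length, it should be plumbed
through (actual_len here is hypothetical):

	cm_event->private_data_len = actual_len;	/* not the max */
	cm_event->private_data = ep_object->priv_data;
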
> - */ > - ret = hvnd_connector_get_priv_data(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - ep_object->priv_data); > - > - if (ret) { > - hvnd_error("Connector get priv data failed; ret: %d\n", ret); > - return ret; > - } > - /* > - * Initialize CM structure. > - */ > - laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; > - hvnd_debug("Local addr is %d\n", laddr->sin_addr.s_addr); > - laddr->sin_port = local_addr.ipv4.sin_port; > - laddr->sin_family = AF_INET; > - > - raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; > - hvnd_debug("Remote addr is %d\n", raddr->sin_addr.s_addr); > - raddr->sin_port = remote_addr.ipv4.sin_port; > - raddr->sin_family = AF_INET; > - > - // KYS; LUke: is it always 148 bytes? > - cm_event->private_data_len = MAX_PRIVATE_DATA_LEN; > - cm_event->private_data = ep_object->priv_data; > - > - cm_event->ird = rd_limits.inbound; > - cm_event->ord = rd_limits.outbound; > - cm_event->event = event; > - > - ep_object->ird = cm_event->ird; > - ep_object->ord = cm_event->ord; > - > - return 0; > -} > - > -static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status) > -{ > - struct iw_cm_event cm_event; > - > - memset(&cm_event, 0, sizeof(cm_event)); > - > - cm_event.event = IW_CM_EVENT_CONNECT_REPLY; > - cm_event.status = status; > - > - hvnd_debug("returning status %d on connector %p\n", status, connector); > - > - if (connector->cm_id && connector->cm_id->event_handler) { > - connector->cm_id->event_handler(connector->cm_id, &cm_event); > - connector->cm_id->rem_ref(connector->cm_id); > - connector->cm_state = hvnd_cm_connect_reply_refused; > - } > -} > - > -void hvnd_process_events(struct work_struct *work) > -{ > - struct hvnd_work *wrk; > - struct nd_read_limits rd_limits; > - struct hvnd_ep_obj *ep_object; > - struct hvnd_ep_obj *parent; > - struct iw_cm_event cm_event; > - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event.local_addr; > - struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event.remote_addr; > - struct ndv_packet_hdr_control_1 *ctrl_hdr; > - union nd_sockaddr_inet local_addr; > - union nd_sockaddr_inet remote_addr; > - struct pkt_nd_get_connection_listener *connection_pkt; > - struct iw_cm_id *cm_id = NULL; > - int status; > - int ioctl; > - int ret; > - char priv_data[MAX_PRIVATE_DATA_LEN]; > - enum ibv_qp_state new_qp_state; > - struct incoming_pkt *incoming_pkt; > - unsigned long flags; > - > - memset(&cm_event, 0, sizeof(cm_event)); > - memset(&priv_data, 0, MAX_PRIVATE_DATA_LEN); > - > - > - wrk = container_of(work, struct hvnd_work, work); > - > - /* > - * Now call into the connection manager. 
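hvnd_process_events() below is the usual "drain a spinlock-protected
list from a work item" loop; stripped to its skeleton it is just:

	for (;;) {
		spin_lock_irqsave(&ep->incoming_pkt_list_lock, flags);
		pkt = list_first_entry_or_null(&ep->incoming_pkt_list,
					       struct incoming_pkt,
					       list_entry);
		if (pkt)
			list_del(&pkt->list_entry);
		spin_unlock_irqrestore(&ep->incoming_pkt_list_lock, flags);
		if (!pkt)
			break;
		handle_pkt(ep, pkt);	/* hypothetical handler */
		kfree(pkt);
	}

Two nits further down in this function: s/rd imits/rd limits/ and
s/Rquest notification/Request notification/.
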
> - */ > - ep_object = (struct hvnd_ep_obj *)wrk->callback_arg; > - parent = ep_object->parent; > - > -process_next: > - incoming_pkt = NULL; > - spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); > - if (!list_empty(&ep_object->incoming_pkt_list)) { > - incoming_pkt = list_first_entry(&ep_object->incoming_pkt_list, > - struct incoming_pkt, > - list_entry); > - list_del(&incoming_pkt->list_entry); > - } > - spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, flags); > - if (incoming_pkt == NULL) > - return; > - > - ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt->pkt; > - status = ctrl_hdr->io_status; > - ioctl = ctrl_hdr->io_cntrl_code; > - > - hvnd_debug("Process Events IOCTL is: %s; iostatus failure: %x in work queue\n", > - hvnd_get_op_name(ioctl), status); > - > - if (status != 0) { > - bool log_error = true; > - > - if (ioctl == IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT && > - status == STATUS_DISCONNECTED) // expected > - log_error = false; > - > - if (log_error) > - hvnd_warn("Process Events IOCTL is: %s; iostatus failure: %x\n", > - hvnd_get_op_name(ioctl), status); > - } > - > - cm_event.status = status; > - > - switch (ep_object->type) { > - case ND_CONNECTOR: > - switch (ioctl) { > - case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: > - > - if (ep_object->parent != NULL) { > - > - // Do nothing with this connection request if > - // listener is stopping > - if (!ep_add_work_pending(ep_object->parent)) > - break; > - > - cm_id = ep_object->parent->cm_id; //Listener > - } > - > - connection_pkt = (struct pkt_nd_get_connection_listener *) ctrl_hdr; > - > - if ((status == 0) || (status == STATUS_CANCELLED)) { > - hvnd_get_incoming_connections(ep_object->parent, > - ep_object->parent->nd_dev, > - ep_object->uctx); > - } > - > - if (status) > - goto get_connection_request_done; > - > - /* > - * Now get the local address. > - */ > - ret = hvnd_connector_get_local_addr(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &local_addr); > - > - if (ret) { > - hvnd_error("Connector get addr failed; ret: %d\n", > - ret); > - goto get_connection_request_done; > - } > - /* > - * Now get the remote address. > - */ > - ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &remote_addr); > - > - if (ret) { > - hvnd_error("Connector get peer addr failed; ret: %d\n", > - ret); > - goto get_connection_request_done; > - } > - /* > - * Get other connection parameters. > - */ > - > - ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &rd_limits); > - > - if (ret) { > - hvnd_error("Connector rd imits failed; ret: %d\n", > - ret); > - goto get_connection_request_done; > - } > - > - /* > - * XXXKYS: Luke: What about the length of the priv data? 
> - */ > - ret = hvnd_connector_get_priv_data( > - ep_object->nd_dev, ep_object->uctx, > - ep_object->ep_handle, ep_object->priv_data); > - > - if (ret) { > - hvnd_error("Connector get priv data failed; ret: %d\n", > - ret); > - goto get_connection_request_done; > - } > - > - cm_event.event = IW_CM_EVENT_CONNECT_REQUEST; > - cm_event.provider_data = (void *)ep_object; > - > - laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; > - hvnd_debug("Local addr is %d\n", > - laddr->sin_addr.s_addr); > - laddr->sin_port = local_addr.ipv4.sin_port; > - laddr->sin_family = AF_INET; > - > - raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; > - hvnd_debug("Remote addr is %d\n", > - raddr->sin_addr.s_addr); > - raddr->sin_port = remote_addr.ipv4.sin_port; > - raddr->sin_family = AF_INET; > - > - // KYS; LUke: is it always 148 bytes? > - cm_event.private_data_len = MAX_PRIVATE_DATA_LEN; > - cm_event.private_data = ep_object->priv_data; > - > - cm_event.ird = rd_limits.inbound; > - cm_event.ord = rd_limits.outbound; > - > - ep_object->ird = cm_event.ird; > - ep_object->ord = cm_event.ord; > - > - > - if ((cm_id != NULL) && cm_id->event_handler) { > - cm_id->event_handler(cm_id, &cm_event); > - ep_object->cm_state = hvnd_cm_connect_request_sent; > - } > - > -get_connection_request_done: > - if (ep_object->parent != NULL) > - ep_del_work_pending(ep_object->parent); > - break; > - > - case IOCTL_ND_CONNECTOR_CONNECT: > - > - cm_event.event = IW_CM_EVENT_CONNECT_REPLY; > - if (status == STATUS_TIMEOUT && > - ep_object->connector_connect_retry < 3) { > - //TIMEOUT retry > - > - if (!ep_add_work_pending(ep_object)) > - goto refuse_connection; > - > - hvnd_warn("Connector connect timed out, reconnecting... retry count: %d\n", > - ep_object->connector_connect_retry); > - ep_object->connector_connect_retry++; > - ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, > - &ep_object->connector_connect_pkt.hdr, > - sizeof(ep_object->connector_connect_pkt), > - (u64)&ep_object->connector_connect_pkt); > - > - if (ret) { > - hvnd_error("Connector on time out failed: %d\n", > - ret); > - ep_del_work_pending(ep_object); > - goto refuse_connection; > - } > - break; > - } > - > -refuse_connection: > - if (status) { > - cm_event.status = -ECONNREFUSED; > - if (status == STATUS_TIMEOUT) > - cm_event.status = -ETIMEDOUT; > - > - hvnd_refuse_connection(ep_object, > - cm_event.status); > - break; > - } > - > - hvnd_debug("ACTIVE Connection ACCEPTED\n"); > - ret = init_cm_event(ep_object, &cm_event, > - IW_CM_EVENT_CONNECT_REPLY); > - if (ret) { > - hvnd_error("init_cm_event failed ret=%d\n", > - ret); > - goto process_done; > - } > - > - ret = hvnd_connector_complete_connect(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &new_qp_state); > - if (ret) { > - hvnd_error("connector_complete failed\n"); > - goto process_done; > - } > - > - cm_event.provider_data = (void *)ep_object; > - > - if ((ep_object->cm_id) && > - (ep_object->cm_id->event_handler)) { > - ep_object->cm_id->event_handler( > - ep_object->cm_id, &cm_event); > - ep_object->cm_state = hvnd_cm_connect_reply_sent; > - } > - /* > - * Rquest notification if the other end > - * were to disconnect. 
> - */ > - if (!ep_add_work_pending(ep_object)) > - goto process_done; > - > - ret = hvnd_connector_notify_disconnect( > - ep_object->nd_dev, ep_object->uctx, > - ep_object->ep_handle, ep_object); > - > - if (ret) { > - ep_del_work_pending(ep_object); > - hvnd_error("Connector notify disconnect failed; ret: %d\n", > - ret); > - } > - > - break; > - > - case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: > - hvnd_process_notify_disconnect(ep_object, status); > - break; > - > - > - default: > - hvnd_error("Unknown Connector IOCTL\n"); > - break; > - } > - break; > - default: > - hvnd_error("Unknown endpoint object\n"); > - break; > - } > -process_done: > - kfree(incoming_pkt); > - ep_del_work_pending(ep_object); > - > - goto process_next; > -} > - > - > -static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, > - struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx) > -{ > - struct hvnd_ep_obj *ep_object; > - int ret; > - > - ep_object = kzalloc(sizeof(struct hvnd_ep_obj), GFP_KERNEL); > - > - if (!ep_object) > - return NULL; > - > - ret = hvnd_init_ep(ep_object, cm_id, ep_type, nd_dev, uctx); > - > - if (ret) { > - hvnd_error("hvnd_init_ep failed ret=%d\n", ret); > - kfree(ep_object); > - return NULL; > - } > - > - return ep_object; > -} > - > -/* > - * return value: > - * true: I/O state is stopped, we should not do upcall > - * flase: I/O state is running and normal > - */ > -#if 0 > -static bool hvnd_manage_io_state(struct hvnd_ep_obj *ep, bool failure) > -{ > - unsigned long flags; > - > - spin_lock_irqsave(&ep->ep_lk, flags); > - if (ep->to_be_destroyed) { > - hvnd_warn("ep being destroyed\n"); > - if (ep->io_outstanding) { > - hvnd_warn("ep being destroyed i/O pending waking up on %p\n", > - &ep->block_event); > - complete(&ep->block_event); > - ep->io_outstanding = false; > - } > - spin_unlock_irqrestore(&ep->ep_lk, flags); > - return true; > - } > - if (!failure) > - ep->io_outstanding = true; > - spin_unlock_irqrestore(&ep->ep_lk, flags); > - return false; > -} > -#endif > - > -static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, > - struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx) > -{ > - struct hvnd_ep_obj *connector; > - u64 connector_handle; > - int ret; > - > - /* > - * First handle the protocol for > - * destruction - outstanding I/O. > - */ > - > -// if (hvnd_manage_io_state(listener, false)) > -// return 0; > - /* > - * Create a connector. > - */ > - connector = hvnd_setup_ep(listener->cm_id, ND_CONNECTOR, nd_dev, uctx); > - if (!connector) { > - hvnd_error("hvnd_setup_ep failed\n"); > - ret = -ENOMEM; > - goto con_alloc_err; > - } > - > - ret = hvnd_cr_connector(nd_dev, uctx, > - &connector_handle); > - if (ret) { > - hvnd_error("hvnd_cr_connector failed ret=%d\n", ret); > - goto con_cr_err; > - } > - > - /* > - * Now get a connection if one is pending. 
> - */ > - connector->ep_handle = connector_handle; > - connector->parent = listener; > - > - if (!ep_add_work_pending(connector)) > - goto get_connection_err; > - > - ret = hvnd_get_connection_listener(nd_dev, uctx, > - listener->ep_handle, > - connector_handle, > - (u64)connector); > - > - if (ret) { > - hvnd_debug("listener_get_connection failed\n"); > - ep_del_work_pending(connector); > - goto get_connection_err; > - } > - > - hvnd_acquire_uctx_ref(uctx); > - listener->outstanding_handle = connector_handle; > - listener->outstanding_ep = connector; > - hvnd_debug("outstanding handle is %p\n", (void *)connector_handle); > - return 0; > - > -get_connection_err: > - hvnd_free_handle(nd_dev, uctx, > - connector_handle, > - IOCTL_ND_CONNECTOR_FREE); > - > -con_cr_err: > - kfree(connector); > -con_alloc_err: > -// hvnd_manage_io_state(listener, true); > - return ret; > -} > - > -static int hvnd_create_listen(struct iw_cm_id *cm_id, int backlog) > -{ > - int ret = 0; > - struct hvnd_dev *nd_dev; > - struct hvnd_ucontext *uctx; > - struct hvnd_ep_obj *ep_object; > - union nd_sockaddr_inet addr; > - union nd_sockaddr_inet local_addr; > - u64 listener_handle; > - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; > - union nd_sockaddr_inet og_addr; > - > - > - nd_dev = to_nd_dev(cm_id->device); > - uctx = get_uctx(nd_dev, current_pid()); > - hvnd_debug("uctx is %p; pid is %d\n", uctx, current_pid()); > - > - if (cm_id->local_addr.ss_family != AF_INET) { > - hvnd_error("cm_id->local_addr.ss_family =%d not AF_INET\n", > - cm_id->local_addr.ss_family); > - return -ENOSYS; > - } > - > - /* > - * If the local address is LOOPBACK or INADDR_ANY, get an an address > - * to bind the listener. For now, just get the first address > - * available. > - */ > - > - if (IN_LOOPBACK(ntohl(laddr->sin_addr.s_addr)) || > - (laddr->sin_addr.s_addr == INADDR_ANY)) { > - > - hvnd_debug("need to get an address\n"); > - ret = hvnd_get_outgoing_rdma_addr(nd_dev, uctx, &og_addr); > - > - if (ret) { > - hvnd_error("failed to get the og address\n"); > - return ret; > - } > - > - laddr->sin_addr.s_addr = og_addr.ipv4.sin_addr.s_addr; > - } > - > - cm_id->add_ref(cm_id); > - > - ep_object = hvnd_setup_ep(cm_id, ND_LISTENER, nd_dev, uctx); > - > - if (!ep_object) { > - hvnd_error("hvnd_setup_ep returned NULL\n"); > - goto alloc_err; > - } > - > - ret = hvnd_cr_listener(nd_dev, uctx, > - &listener_handle); > - if (ret) { > - hvnd_error("hvnd_cr_listener failed ret=%d\n", ret); > - goto cr_err; > - } > - > - ep_object->ep_handle = listener_handle; > - > - cm_id->provider_data = ep_object; > - > - /* > - * Now bind the listener. > - * IPV4 support only. > - */ > - memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); > - > - ret = hvnd_bind_listener(nd_dev, uctx, > - listener_handle, > - &addr); > - if (ret) { > - hvnd_error("hvnd_bind_listener failed ret=%d\n", ret); > - goto bind_err; > - } > - > - /* > - * Now get the local address. > - */ > - ret = hvnd_get_addr_listener(nd_dev, uctx, > - listener_handle, > - &local_addr); > - if (ret) { > - hvnd_error("hvnd_get_addr_listener failed ret=%d\n", ret); > - goto bind_err; > - } > - > - /* > - * Now put the listener in the listen mode. > - */ > - > - ret = hvnd_listen_listener(nd_dev, uctx, > - listener_handle, > - backlog); > - > - if (ret) { > - hvnd_error("hvnd_listen_listener failed ret=%d\n", ret); > - goto bind_err; > - } > - > - > - /* > - * Now get a pending connection if one is pending. 
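For context, this create_listen path (above) is what an rdma_cm
consumer reaches through the usual bind/listen pair (sketch):

	rdma_bind_addr(cm_id, (struct sockaddr *)&sin);
	rdma_listen(cm_id, backlog);	/* lands in hvnd_create_listen() */

Spelling nits nearby: s/flase/false/ (comment above
hvnd_manage_io_state), s/an an address/an address/, and
s/ouststanding/outstanding/ (comment in hvnd_destroy_listen below).
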
> - */ > - ret = hvnd_get_incoming_connections(ep_object, nd_dev, uctx); > - if (ret) { > - hvnd_error("hvnd_get_incoming_connections failed ret=%d\n", > - ret); > - goto bind_err; > - } > - > - hvnd_acquire_uctx_ref(uctx); > - hvnd_debug("cm_id=%p\n", cm_id); > - return 0; > - > -bind_err: > - hvnd_free_handle(nd_dev, uctx, > - listener_handle, > - IOCTL_ND_LISTENER_FREE); > -cr_err: > - kfree(ep_object); > -alloc_err: > - cm_id->provider_data = NULL; > - cm_id->rem_ref(cm_id); > - return ret; > -} > - > -static int hvnd_destroy_listen(struct iw_cm_id *cm_id) > -{ > - struct hvnd_dev *nd_dev; > - struct hvnd_ucontext *uctx; > - struct hvnd_ep_obj *ep_object; > - > - nd_dev = to_nd_dev(cm_id->device); > - > - ep_object = (struct hvnd_ep_obj *)cm_id->provider_data; > - > - hvnd_debug("uctx is %p\n", ep_object->uctx); > - hvnd_debug("Destroying Listener cm_id=%p\n", cm_id); > - uctx = ep_object->uctx; > - > - // make sure there is nothing in progress on this ep > - ep_stop(ep_object); > - > - hvnd_free_handle(nd_dev, uctx, > - ep_object->ep_handle, > - IOCTL_ND_LISTENER_FREE); > - > - /* > - * We may have an ouststanding connector for > - * incoming connection requests; clean it up. > - */ > - > - if (ep_object->outstanding_handle != 0) { > - > - // make sure there is nothing in progress on this ep > - ep_stop(ep_object->outstanding_ep); > - > - hvnd_free_handle(nd_dev, uctx, > - ep_object->outstanding_handle, > - IOCTL_ND_CONNECTOR_FREE); > - > - > - hvnd_drop_uctx_ref(nd_dev, uctx); > - hvnd_destroy_ep(ep_object->outstanding_ep); > - } > - > - /* > - * Now everything should have stopped > - */ > - > - cm_id->rem_ref(cm_id); > - hvnd_destroy_ep(ep_object); > - cm_id->provider_data = NULL; > - hvnd_drop_uctx_ref(nd_dev, uctx); > - > - hvnd_debug("cm_id=%p\n", cm_id); > - return 0; > -} > - > -static void hvnd_qp_add_ref(struct ib_qp *ibqp) > -{ > - struct hvnd_qp *qp = to_nd_qp(ibqp); > - > - atomic_inc(&qp->refcnt); > -} > - > -void hvnd_qp_rem_ref(struct ib_qp *ibqp) > -{ > - struct hvnd_qp *qp = to_nd_qp(ibqp); > - > - if (atomic_dec_and_test(&qp->refcnt)) > - wake_up(&qp->wait); > -} > - > -static DEVICE_ATTR(hw_rev, 0444, hvnd_show_rev, NULL); > -static DEVICE_ATTR(fw_ver, 0444, hvnd_show_fw_ver, NULL); > -static DEVICE_ATTR(hca_type, 0444, hvnd_show_hca, NULL); > -static DEVICE_ATTR(board_id, 0444, hvnd_show_board, NULL); > - > -static struct device_attribute *hvnd_class_attributes[] = { > - &dev_attr_hw_rev, > - &dev_attr_fw_ver, > - &dev_attr_hca_type, > - &dev_attr_board_id, > -}; > - > -int hvnd_register_device(struct hvnd_dev *dev, char *ip_addr, char *mac_addr) > -{ > - int ret; > - > - dev->ibdev.owner = THIS_MODULE; > - dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY | IB_DEVICE_MEM_WINDOW; > - dev->ibdev.local_dma_lkey = 0; > - dev->ibdev.uverbs_cmd_mask = > - (1ull << IB_USER_VERBS_CMD_GET_CONTEXT) | > - (1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) | > - (1ull << IB_USER_VERBS_CMD_QUERY_PORT) | > - (1ull << IB_USER_VERBS_CMD_ALLOC_PD) | > - (1ull << IB_USER_VERBS_CMD_DEALLOC_PD) | > - (1ull << IB_USER_VERBS_CMD_REG_MR) | > - (1ull << IB_USER_VERBS_CMD_DEREG_MR) | > - (1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL) | > - (1ull << IB_USER_VERBS_CMD_CREATE_CQ) | > - (1ull << IB_USER_VERBS_CMD_DESTROY_CQ) | > - (1ull << IB_USER_VERBS_CMD_REQ_NOTIFY_CQ) | > - (1ull << IB_USER_VERBS_CMD_CREATE_QP) | > - (1ull << IB_USER_VERBS_CMD_MODIFY_QP) | > - (1ull << IB_USER_VERBS_CMD_QUERY_QP) | > - (1ull << IB_USER_VERBS_CMD_POLL_CQ) | > - (1ull << IB_USER_VERBS_CMD_DESTROY_QP) | > - (1ull << 
IB_USER_VERBS_CMD_POST_SEND) | > - (1ull << IB_USER_VERBS_CMD_POST_RECV); > - dev->ibdev.node_type = RDMA_NODE_RNIC; > - memcpy(dev->ibdev.node_desc, HVND_NODE_DESC, sizeof(HVND_NODE_DESC)); > - memcpy(&dev->ibdev.node_guid, mac_addr, 6); > - dev->ibdev.phys_port_cnt = 1; //dev->nports; > - dev->ibdev.num_comp_vectors = 1; > - dev->ibdev.query_device = hvnd_query_device; > - dev->ibdev.query_port = hvnd_query_port; > - dev->ibdev.get_link_layer = hvnd_get_link_layer; > - dev->ibdev.query_pkey = hvnd_query_pkey; > - dev->ibdev.query_gid = hvnd_query_gid; > - dev->ibdev.alloc_ucontext = hvnd_alloc_ucontext; > - dev->ibdev.dealloc_ucontext = hvnd_dealloc_ucontext; > - dev->ibdev.mmap = hvnd_mmap; > - dev->ibdev.alloc_pd = hvnd_allocate_pd; > - dev->ibdev.dealloc_pd = hvnd_deallocate_pd; > - dev->ibdev.create_ah = hvnd_ah_create; > - dev->ibdev.destroy_ah = hvnd_ah_destroy; > - dev->ibdev.create_qp = hvnd_ib_create_qp; > - dev->ibdev.modify_qp = hvnd_ib_modify_qp; > - dev->ibdev.query_qp = hvnd_ib_query_qp; > - dev->ibdev.destroy_qp = hvnd_destroy_qp; > - dev->ibdev.create_cq = hvnd_ib_create_cq; > - dev->ibdev.destroy_cq = hvnd_ib_destroy_cq; > - dev->ibdev.resize_cq = hvnd_resize_cq; > - dev->ibdev.poll_cq = hvnd_poll_cq; > - dev->ibdev.get_dma_mr = hvnd_get_dma_mr; > - dev->ibdev.reg_user_mr = hvnd_reg_user_mr; > - dev->ibdev.dereg_mr = hvnd_dereg_mr; > - dev->ibdev.alloc_mw = hvnd_alloc_mw; > - dev->ibdev.dealloc_mw = hvnd_dealloc_mw; > - dev->ibdev.attach_mcast = hvnd_multicast_attach; > - dev->ibdev.detach_mcast = hvnd_multicast_detach; > - dev->ibdev.process_mad = hvnd_process_mad; > - dev->ibdev.req_notify_cq = hvnd_arm_cq; > - dev->ibdev.post_send = hvnd_post_send; > - dev->ibdev.post_recv = hvnd_post_receive; > - dev->ibdev.uverbs_abi_ver = MLX4_IB_UVERBS_ABI_VERSION; > - > - dev->ibdev.get_port_immutable = hvnd_get_port_immutable; > - > - //DMA ops for mapping all possible addresses > - dev->ibdev.dev.parent = &(dev->hvdev->device); > - dev->ibdev.dev.dma_ops = &vmbus_dma_ops; > - dev->ibdev.dev.dma_mask = (u64 *) DMA_BIT_MASK(64); > - dev->ibdev.dev.coherent_dma_mask = DMA_BIT_MASK(64); > - > - dev->ibdev.iwcm = kmalloc(sizeof(struct iw_cm_verbs), GFP_KERNEL); > - if (!dev->ibdev.iwcm) > - return -ENOMEM; > - > - dev->ibdev.iwcm->connect = hvnd_connect; > - dev->ibdev.iwcm->accept = hvnd_accept_cr; > - dev->ibdev.iwcm->reject = hvnd_reject_cr; > - dev->ibdev.iwcm->create_listen = hvnd_create_listen; > - dev->ibdev.iwcm->destroy_listen = hvnd_destroy_listen; > - dev->ibdev.iwcm->add_ref = hvnd_qp_add_ref; > - dev->ibdev.iwcm->rem_ref = hvnd_qp_rem_ref; > - dev->ibdev.iwcm->get_qp = hvnd_get_qp; > - > - strlcpy(dev->ibdev.name, "mlx4_%d", IB_DEVICE_NAME_MAX); > - ret = ib_register_device(&dev->ibdev, NULL); > - if (ret) { > - hvnd_error("ib_register_device failed ret=%d\n", ret); > - goto bail1; > - } > - > -#if 0 > - for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) { > - ret = device_create_file(&dev->ibdev.dev, > - hvnd_class_attributes[i]); > - if (ret) { > - hvnd_error("device_create_file failed ret=%d\n", ret); > - goto bail2; > - } > - } > -#endif > - > - dev->ib_active = true; > - return 0; > -//bail2: > -// ib_unregister_device(&dev->ibdev); > -bail1: > - kfree(dev->ibdev.iwcm); > - return ret; > -} > - > -void hvnd_unregister_device(struct hvnd_dev *dev) > -{ > - int i; > - > - for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) > - device_remove_file(&dev->ibdev.dev, > - hvnd_class_attributes[i]); > - ib_unregister_device(&dev->ibdev); > - kfree(dev->ibdev.iwcm); 
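One asymmetry worth flagging here: hvnd_register_device() compiles its device_create_file() loop out under #if 0 (along with the bail2 unwind), yet hvnd_unregister_device(), being quoted around this point, unconditionally calls device_remove_file() for all four attributes that were never created. Removing a file that was never added is generally tolerated by sysfs, but the two sides should be guarded the same way. If the attributes come back, the conventional create side with rollback looks like this (sketch only; it reuses the hvnd_class_attributes array defined above):

/* Sketch: create sysfs attrs with rollback, matching the
 * unconditional remove loop in hvnd_unregister_device().
 */
static int hvnd_create_attrs(struct device *dev)
{
        int i, ret;

        for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); i++) {
                ret = device_create_file(dev, hvnd_class_attributes[i]);
                if (ret)
                        goto undo;
        }
        return 0;

undo:
        while (--i >= 0)
                device_remove_file(dev, hvnd_class_attributes[i]);
        return ret;
}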
> - ib_dealloc_device((struct ib_device *)dev); > -} > - > -static int hvnd_try_bind_nic(unsigned char *mac, __be32 ip) > -{ > - int ret; > - struct hvnd_dev *nd_dev = g_nd_dev; > - > - mutex_lock(&nd_dev->bind_mutex); > - if (nd_dev->bind_complete) { > - mutex_unlock(&nd_dev->bind_mutex); > - return 1; > - } > - > - memcpy(nd_dev->mac_addr, mac, 6); > - *(__be32 *)(nd_dev->ip_addr) = ip; > - > - /* > - * Bind the NIC. > - */ > - hvnd_info("trying to bind to IP %pI4 MAC %pM\n", > - nd_dev->ip_addr, nd_dev->mac_addr); > - ret = hvnd_bind_nic(nd_dev, false, nd_dev->ip_addr, nd_dev->mac_addr); > - if (ret || nd_dev->bind_pkt.pkt_hdr.status) { > - mutex_unlock(&nd_dev->bind_mutex); > - return 1; > - } > - > - /* if we reach here, this means bind_nic is a success */ > - hvnd_error("successfully bound to IP %pI4 MAC %pM\n", > - nd_dev->ip_addr, nd_dev->mac_addr); > - complete(&nd_dev->addr_set); > - nd_dev->bind_complete = 1; > - mutex_unlock(&nd_dev->bind_mutex); > - > - ret = hvnd_register_device(nd_dev, nd_dev->ip_addr, nd_dev->mac_addr); > - > - if (!ret) > - return 0; > - > - hvnd_error("hvnd_register_device failed ret=%d\n", ret); > - > - /* roll back all allocated resources on error */ > - iounmap(nd_dev->mmio_virt); > - release_resource(&nd_dev->mmio_resource); > - > - vmbus_close(nd_dev->hvdev->channel); > - ib_dealloc_device((struct ib_device *)nd_dev); > - > - return 1; > -} > - > -static void hvnd_inetaddr_event_up(unsigned long event, struct in_ifaddr *ifa) > -{ > - hvnd_try_bind_nic(ifa->ifa_dev->dev->dev_addr, ifa->ifa_address); > -} > - > -static int hvnd_inetaddr_event(struct notifier_block *notifier, > - unsigned long event, void *ptr) > -{ > - struct in_ifaddr *ifa = ptr; > - > - switch (event) { > - case NETDEV_UP: > - hvnd_inetaddr_event_up(event, ifa); > - break; > - default: > - hvnd_debug("Received inetaddr event %lu\n", event); > - } > - > - return NOTIFY_DONE; > -} > - > -static struct notifier_block hvnd_inetaddr_notifier = { > - .notifier_call = hvnd_inetaddr_event, > -}; > - > -static int start_bind_nic(void) > -{ > - struct net_device *dev; > - struct in_device *idev; > - struct in_ifaddr *ifa; > - > - register_inetaddr_notifier(&hvnd_inetaddr_notifier); > - > - rtnl_lock(); > - for_each_netdev(&init_net, dev) { > - idev = in_dev_get(dev); > - if (!idev) > - continue; > - for (ifa = (idev)->ifa_list; > - ifa && !(ifa->ifa_flags&IFA_F_SECONDARY); > - ifa = ifa->ifa_next) { > - hvnd_try_bind_nic(dev->dev_addr, ifa->ifa_address); > - } > - } > - rtnl_unlock(); > - > - return 0; > -} > - > -static int hvnd_probe(struct hv_device *dev, > - const struct hv_vmbus_device_id *dev_id) > -{ > - struct hvnd_dev *nd_dev; > - int ret = 0; > - > - hvnd_debug("hvnd starting\n"); > - > - nd_dev = (struct hvnd_dev *)ib_alloc_device(sizeof(struct hvnd_dev)); > - if (!nd_dev) { > - ret = -ENOMEM; > - goto err_out0; > - } > - > - nd_dev->hvdev = dev; > - /* > - * We are going to masquerade as MLX4 device; > - * Set the vendor and device ID accordingly. > - */ > - dev->vendor_id = 0x15b3; //Mellanox > - dev->device_id = 0x1003; //Mellanox HCA > - INIT_LIST_HEAD(&nd_dev->listentry); > - spin_lock_init(&nd_dev->uctxt_lk); > - nd_dev->ib_active = false; > - > - /* > - * Initialize the state for the id table. 
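A small log-level nit in hvnd_try_bind_nic() above: the success path ("successfully bound to IP ... MAC ...") is printed with hvnd_error(), directly under a comment saying it is the success case; hvnd_info() was presumably intended. Beyond that, the bind flow is the standard inetaddr-notifier shape — walk the existing addresses once under rtnl_lock(), then let the notifier catch addresses that appear later. The minimal version of that shape, using only stock kernel API:

/* Sketch: minimal inetaddr notifier, same pattern as above. */
#include <linux/inetdevice.h>
#include <linux/notifier.h>

static int my_inetaddr_event(struct notifier_block *nb,
                             unsigned long event, void *ptr)
{
        struct in_ifaddr *ifa = ptr;

        if (event == NETDEV_UP)
                pr_info("IPv4 address %pI4 is up\n", &ifa->ifa_address);

        return NOTIFY_DONE;
}

static struct notifier_block my_inetaddr_nb = {
        .notifier_call = my_inetaddr_event,
};

/* register_inetaddr_notifier(&my_inetaddr_nb) on load,
 * unregister_inetaddr_notifier(&my_inetaddr_nb) on unload.
 */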
> - */ > - spin_lock_init(&nd_dev->id_lock); > - idr_init(&nd_dev->cqidr); > - idr_init(&nd_dev->qpidr); > - idr_init(&nd_dev->mmidr); > - idr_init(&nd_dev->irpidr); > - idr_init(&nd_dev->uctxidr); > - > - atomic_set(&nd_dev->open_cnt, 0); > - > - sema_init(&nd_dev->query_pkt_sem, 1); > - > - ret = vmbus_open(dev->channel, HVND_RING_SZ, HVND_RING_SZ, NULL, 0, > - hvnd_callback, dev); > - > - if (ret) { > - hvnd_error("vmbus_open failed ret=%d\n", ret); > - goto err_out1; > - } > - > - hv_set_drvdata(dev, nd_dev); > - > - ret = hvnd_negotiate_version(nd_dev); > - > - if (ret) { > - hvnd_error("hvnd_negotiate_version failed ret=%d\n", ret); > - goto err_out2; > - } > - > - /* > - * Register resources with the host. > - */ > - ret = hvnd_init_resources(nd_dev); > - if (ret) { > - hvnd_error("hvnd_init_resources failed ret=%d\n", ret); > - goto err_out2; > - } > - > - /* > - * Try to bind every NIC to ND channel, > - * ND host will only return success for the correct one > - */ > - nd_dev->bind_complete = 0; > - mutex_init(&nd_dev->bind_mutex); > - init_completion(&nd_dev->addr_set); > - > - g_nd_dev = nd_dev; > - start_bind_nic(); > - > - return 0; > - > -err_out2: > - vmbus_close(dev->channel); > - > -err_out1: > - ib_dealloc_device((struct ib_device *)nd_dev); > - > -err_out0: > - return ret; > -} > - > -static int hvnd_remove(struct hv_device *dev) > -{ > - struct hvnd_dev *nd_dev = hv_get_drvdata(dev); > - > - unregister_inetaddr_notifier(&hvnd_inetaddr_notifier); > - hvnd_bind_nic(nd_dev, true, nd_dev->ip_addr, nd_dev->mac_addr); > - hvnd_unregister_device(nd_dev); > - vmbus_close(dev->channel); > - iounmap(nd_dev->mmio_virt); > - release_resource(&nd_dev->mmio_resource); > - return 0; > -} > - > -static const struct hv_vmbus_device_id id_table[] = { > - /* VMBUS RDMA class guid */ > - /* 8c2eaf3d-32a7-4b09-ab99-bd1f1c86b501 */ > - { HV_ND_GUID, }, > - { }, > -}; > - > -MODULE_DEVICE_TABLE(vmbus, id_table); > - > -static struct hv_driver hvnd_drv = { > - .name = "hv_guest_rdma", > - .id_table = id_table, > - .probe = hvnd_probe, > - .remove = hvnd_remove, > -}; > - > - > -static int __init init_hvnd_drv(void) > -{ > - > - pr_info("Registered HyperV networkDirect Driver\n"); > - hvnd_addr_init(); > - return vmbus_driver_register(&hvnd_drv); > -} > - > -static void exit_hvnd_drv(void) > -{ > - pr_info("De-Registered HyperV networkDirect Driver\n"); > - hvnd_addr_deinit(); > - vmbus_driver_unregister(&hvnd_drv); > -} > - > - > -module_init(init_hvnd_drv); > -module_exit(exit_hvnd_drv); > - > -MODULE_DESCRIPTION("Hyper-V NetworkDirect Driver"); > -MODULE_LICENSE("GPL"); > -MODULE_VERSION("4.1.3"); > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/user.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/user.h > deleted file mode 100644 > index 07e6769ef43b..000000000000 > --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/user.h > +++ /dev/null > @@ -1,107 +0,0 @@ > -/* > - * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved. > - * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved. > - * > - * This software is available to you under a choice of one of two > - * licenses. 
You may choose to be licensed under the terms of the GNU > - * General Public License (GPL) Version 2, available from the file > - * COPYING in the main directory of this source tree, or the > - * OpenIB.org BSD license below: > - * > - * Redistribution and use in source and binary forms, with or > - * without modification, are permitted provided that the following > - * conditions are met: > - * > - * - Redistributions of source code must retain the above > - * copyright notice, this list of conditions and the following > - * disclaimer. > - * > - * - Redistributions in binary form must reproduce the above > - * copyright notice, this list of conditions and the following > - * disclaimer in the documentation and/or other materials > - * provided with the distribution. > - * > - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND > - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS > - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN > - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN > - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE > - * SOFTWARE. > - */ > - > -#ifndef MLX4_IB_USER_H > -#define MLX4_IB_USER_H > - > -#include <linux/types.h> > - > -/* > - * Increment this value if any changes that break userspace ABI > - * compatibility are made. > - */ > - > -#define MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION 3 > -#define MLX4_IB_UVERBS_ABI_VERSION 4 > - > -/* > - * Make sure that all structs defined in this file remain laid out so > - * that they pack the same way on 32-bit and 64-bit architectures (to > - * avoid incompatibility between 32-bit userspace and 64-bit kernels). > - * In particular do not use pointer types -- pass pointers in __u64 > - * instead. > - */ > - > -struct mlx4_ib_alloc_ucontext_resp_v3 { > - __u32 qp_tab_size; > - __u16 bf_reg_size; > - __u16 bf_regs_per_page; > -}; > - > -struct mlx4_ib_alloc_ucontext_resp { > - __u32 dev_caps; > - __u32 qp_tab_size; > - __u16 bf_reg_size; > - __u16 bf_regs_per_page; > - __u32 cqe_size; > -}; > - > -struct mlx4_ib_alloc_pd_resp { > - __u32 pdn; > - __u32 reserved; > -}; > - > -struct mlx4_ib_create_cq { > - __u64 buf_addr; > - __u64 db_addr; > -}; > - > -struct mlx4_ib_create_cq_resp { > - __u32 cqn; > - __u32 reserved; > -}; > - > -struct mlx4_ib_resize_cq { > - __u64 buf_addr; > -}; > - > -struct mlx4_ib_create_srq { > - __u64 buf_addr; > - __u64 db_addr; > -}; > - > -struct mlx4_ib_create_srq_resp { > - __u32 srqn; > - __u32 reserved; > -}; > - > -struct mlx4_ib_create_qp { > - __u64 buf_addr; > - __u64 db_addr; > - __u8 log_sq_bb_count; > - __u8 log_sq_stride; > - __u8 sq_no_prefetch; > - __u8 reserved[5]; > -}; > - > -#endif /* MLX4_IB_USER_H */ > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.c > deleted file mode 100644 > index daf68dcccb1b..000000000000 > --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.c > +++ /dev/null > @@ -1,3074 +0,0 @@ > -/* > - * Copyright (c) 2014, Microsoft Corporation. > - * > - * Author: > - * K. Y. Srinivasan <kys@microsoft.com> > - * > - * This program is free software; you can redistribute it and/or modify it > - * under the terms of the GNU General Public License version 2 as published > - * by the Free Software Foundation. 
> - * > - * This program is distributed in the hope that it will be useful, but > - * WITHOUT ANY WARRANTY; without even the implied warranty of > - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or > - * NON INFRINGEMENT. See the GNU General Public License for more > - * details. > - * > - * Bug fixes/enhancements: Long Li <longli@microsoft.com> > - */ > - > -#include <linux/completion.h> > -#include <linux/module.h> > -#include <linux/errno.h> > -#include <linux/hyperv.h> > -#include <linux/efi.h> > -#include <linux/slab.h> > -#include <linux/cred.h> > -#include <linux/sched.h> > -#include <linux/types.h> > -#include <linux/scatterlist.h> > -#include <rdma/ib_umem.h> > -#include <rdma/ib_user_verbs.h> > -#include <asm-generic/delay.h> > - > -#include "vmbus_rdma.h" > - > -/* > - * We only have a single rdma device on the host; > - * have a single receive buffer. > - */ > - > - > -static char hvnd_recv_buffer[PAGE_SIZE * 4]; > - > -static atomic_t irp_local_hdl; > - > -char *hvnd_get_op_name(int ioctl) > -{ > - switch (ioctl) { > - case IOCTL_ND_PROVIDER_INIT: > - return "IOCTL_ND_PROVIDER_INIT\n"; > - case IOCTL_ND_PROVIDER_BIND_FILE: > - return "IOCTL_ND_PROVIDER_BIND_FILE\n"; > - case IOCTL_ND_ADAPTER_OPEN: > - return "IOCTL_ND_ADAPTER_OPEN\n"; > - > - case IOCTL_ND_ADAPTER_CLOSE: > - return "IOCTL_ND_ADAPTER_CLOSE\n"; > - > - case IOCTL_ND_ADAPTER_QUERY: > - return "IOCTL_ND_ADAPTER_QUERY\n"; > - > - case IOCTL_ND_PD_CREATE: > - return "IOCTL_ND_PD_CREATE\n"; > - > - case IOCTL_ND_PD_FREE: > - return "IOCTL_ND_PD_FREE\n"; > - > - case IOCTL_ND_CQ_CREATE: > - return "IOCTL_ND_CQ_CREATE\n"; > - > - case IOCTL_ND_CQ_FREE: > - return "IOCTL_ND_CQ_FREE\n"; > - case IOCTL_ND_CQ_CANCEL_IO: > - return "IOCTL_ND_CQ_CANCEL_IO\n"; > - case IOCTL_ND_CQ_GET_AFFINITY: > - return "IOCTL_ND_CQ_GET_AAFINITY\n"; > - case IOCTL_ND_CQ_MODIFY: > - return "IOCTL_ND_CQ_MODIFY\n"; > - > - case IOCTL_ND_CQ_NOTIFY: > - return "IOCTL_ND_CQ_NOTIFY\n"; > - > - > - case IOCTL_ND_LISTENER_CREATE: > - return "IOCTL_ND_LISTENER_CREATE\n"; > - > - case IOCTL_ND_LISTENER_FREE: > - return "IOCTL_ND_LISTENER_FREE\n"; > - > - case IOCTL_ND_QP_FREE: > - return "IOCTL_ND_QP_FREE\n"; > - > - case IOCTL_ND_CONNECTOR_CANCEL_IO: > - return "IOCTL_ND_CONNECTOR_CANCEL_IO\n"; > - > - case IOCTL_ND_LISTENER_CANCEL_IO: > - return "IOCTL_ND_LISTENER_CANCEL_IO\n"; > - > - case IOCTL_ND_LISTENER_BIND: > - return "IOCTL_ND_LISTENER_BIND\n"; > - > - case IOCTL_ND_LISTENER_LISTEN: > - return "IOCTL_ND_LISTENER_LISTEN\n"; > - > - case IOCTL_ND_LISTENER_GET_ADDRESS: > - return "IOCTL_ND_LISTENER_GET_ADDRESS\n"; > - > - case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: > - return "IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST\n"; > - > - > - > - case IOCTL_ND_CONNECTOR_CREATE: > - return "IOCTL_ND_CONNECTOR_CREATE\n"; > - > - case IOCTL_ND_CONNECTOR_FREE: > - return "IOCTL_ND_CONNECTOR_FREE\n"; > - > - case IOCTL_ND_CONNECTOR_BIND: > - return "IOCTL_ND_CONNECTOR_BIND\n"; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_CONNECT: > - return "IOCTL_ND_CONNECTOR_CONNECT\n"; > - > - case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: > - return "IOCTL_ND_CONNECTOR_COMPLETE_CONNECT\n"; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_ACCEPT: > - return "IOCTL_ND_CONNECTOR_ACCEPT\n"; > - > - case IOCTL_ND_CONNECTOR_REJECT: > - return "IOCTL_ND_CONNECTOR_REJECT\n"; > - > - case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: > - return "IOCTL_ND_CONNECTOR_GET_READ_LIMITS\n"; > - > - case 
IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: > - return "IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA\n"; > - > - case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: > - return "IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS\n"; > - > - case IOCTL_ND_CONNECTOR_GET_ADDRESS: > - return "IOCTL_ND_CONNECTOR_GET_ADDRESS\n"; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: > - return "IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT\n"; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_DISCONNECT: > - return "IOCTL_ND_CONNECTOR_DISCONNECT\n"; > - > - > - > - case IOCTL_ND_QP_CREATE: > - return "IOCTL_ND_QP_CREATE\n"; > - > - case IOCTL_ND_MR_CREATE: > - return "IOCTL_ND_MR_CREATE\n"; > - > - case IOCTL_ND_MR_FREE: > - return "IOCTL_ND_MR_FREE\n"; > - case IOCTL_ND_MR_REGISTER: > - return "IOCTL_ND_MR_REGISTER\n"; > - case IOCTL_ND_MR_DEREGISTER: > - return "IOCTL_ND_MR_DEREGISTER\n"; > - case IOCTL_ND_MR_CANCEL_IO: > - return "IOCTL_ND_MR_CANCEL_IO\n"; > - case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: > - return "IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST\n"; > - case IOCTL_ND_QP_FLUSH: > - return "IOCTL_ND_QP_FLUSH\n"; > - > - default: > - return "Unknown IOCTL\n"; > - } > -} > -int get_irp_handle(struct hvnd_dev *nd_dev, u32 *local, void *irp_ctx) > -{ > - unsigned int local_handle; > - int ret; > - > - local_handle = atomic_inc_return(&irp_local_hdl); > - *local = local_handle; > - > - /* > - * Now asssociate the local handle with the pointer. > - */ > - ret = insert_handle(nd_dev, &nd_dev->irpidr, irp_ctx, local_handle); > - hvnd_debug("irp_ctx=%p local_handle=%u\n", irp_ctx, local_handle); > - > - if (ret) { > - hvnd_error("insert_handle failed ret=%d\n", ret); > - return ret; > - } > - > - return 0; > -} > - > -void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp) > -{ > - remove_handle(nd_dev, &nd_dev->irpidr, irp); > - > -} > - > -static void init_pfn(u64 *pfn, void *addr, u32 length) > -{ > - int i; > - u32 offset = offset_in_page(addr); > - u32 num_pfn = DIV_ROUND_UP(offset + length, PAGE_SIZE); > - > - for (i = 0; i < num_pfn; i++) > - pfn[i] = virt_to_phys((u8 *)addr + > - (PAGE_SIZE * i)) >> PAGE_SHIFT; > -} > - > - > -static void user_va_init_pfn(u64 *pfn, struct ib_umem *umem) > -{ > - int entry; > - struct scatterlist *sg; > - int i = 0; > - > - for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) { > - pfn[i++] = page_to_pfn(sg_page(sg)); > - } > -} > - > -static u32 get_local_handle(void *p) > -{ > - u64 val = (unsigned long)p; > - > - return (u32)val; > -} > - > -static int hvnd_send_pg_buffer(struct hvnd_dev *nd_dev, > - struct vmbus_packet_mpb_array *desc, > - u32 desc_size, > - void *buffer, > - u32 bufferlen, u64 cookie) > -{ > - int ret; > - int t; > - struct hvnd_cookie hvnd_cookie; > - > - hvnd_cookie.pkt = (void *)cookie; > - init_completion(&hvnd_cookie.host_event); > - > - ret = vmbus_sendpacket_mpb_desc(nd_dev->hvdev->channel, > - desc, > - desc_size, > - buffer, bufferlen, > - (u64)(&hvnd_cookie)); > - > - if (ret) { > - hvnd_error("vmbus_sendpacket_mpb_desc failed ret=%d\n", ret); > - goto err; > - } > - > - t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); > - > - if (t == 0) { > - hvnd_error("wait_for_completion_timeout timed out\n"); > - ret = -ETIMEDOUT; > - } > - > -err: > - return ret; > -} > - > -static int hvnd_send_packet(struct hvnd_dev *nd_dev, void *buffer, > - u32 bufferlen, u64 cookie, bool block) > -{ > - int ret; > - int t; > - struct hvnd_cookie hvnd_cookie; > - > - hvnd_cookie.pkt = (void *)cookie; > - 
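Two things around this point. First, a nit in hvnd_get_op_name() above: every string it returns embeds a trailing "\n", so the callers below that print "IOCTL: %s failed; status is %x\n" emit the message split across two lines — and the IOCTL_ND_CQ_GET_AFFINITY case returns the misspelled "IOCTL_ND_CQ_GET_AAFINITY". Second, the send helpers all hang a stack-allocated hvnd_cookie off the VMBus request id so the channel callback (hvnd_callback, not shown in this hunk) can complete the waiter. The request/response core, reduced to its bones:

/* Sketch: blocking request/response over a VMBus channel. */
struct req_cookie {
        void *pkt;                      /* request, for the callback */
        struct completion done;         /* completed by the callback */
};

static int send_and_wait(struct vmbus_channel *chan, void *buf, u32 len)
{
        struct req_cookie cookie = { .pkt = buf };
        int ret;

        init_completion(&cookie.done);

        ret = vmbus_sendpacket(chan, buf, len,
                               (u64)(unsigned long)&cookie,
                               VM_PKT_DATA_INBAND,
                               VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
        if (ret)
                return ret;

        /* 500 * HZ mirrors the generous timeout used in this file */
        if (!wait_for_completion_timeout(&cookie.done, 500 * HZ))
                return -ETIMEDOUT;

        return 0;
}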
init_completion(&hvnd_cookie.host_event); > - > - ret = vmbus_sendpacket(nd_dev->hvdev->channel, buffer, bufferlen, > - (u64)(&hvnd_cookie), VM_PKT_DATA_INBAND, > - VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); > - > - if (ret) { > - hvnd_error("vmbus_send pkt failed: %d\n", ret); > - goto err; > - } > - > - if (!block) > - return ret; > - > - t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); > - > - if (t == 0) { > - hvnd_error("wait_for_completion_timeout timed out\n"); > - ret = -ETIMEDOUT; > - } > - > -err: > - return ret; > -} > - > -static int hvnd_send_pgbuf_ioctl_pkt(struct hvnd_dev *nd_dev, > - struct vmbus_packet_mpb_array *desc, > - u32 desc_size, > - struct ndv_packet_hdr_control_1 *hdr, > - u32 pkt_size, u64 cookie) > -{ > - int ret; > - int ioctl; > - > - ioctl = hdr->io_cntrl_code; > - > - > - ret = hvnd_send_pg_buffer(nd_dev, desc, desc_size, > - hdr, pkt_size, cookie); > - > - if (ret) > - return ret; > - > - if (hdr->pkt_hdr.status != 0) { > - hvnd_error("IOCTL: %s failed; status is %x\n", > - hvnd_get_op_name(ioctl), > - hdr->pkt_hdr.status); > - return -EINVAL; > - } > - > - switch (hdr->io_status) { > - case STATUS_SUCCESS: > - case STATUS_PENDING: > - return 0; > - > - default: > - hvnd_error("IOCTL: %s failed io status is %x\n", > - hvnd_get_op_name(ioctl), hdr->io_status); > - return -EINVAL; > - } > -} > - > -int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev, > - struct ndv_packet_hdr_control_1 *hdr, > - u32 pkt_size, u64 cookie) > -{ > - int ret; > - int ioctl; > - bool block; > - > - block = (hdr->irp_handle.val64 == 0) ? true : false; > - > - > - ioctl = hdr->io_cntrl_code; > - > - ret = hvnd_send_packet(nd_dev, hdr, pkt_size, cookie, block); > - > - if (ret) > - return ret; > - > - if (!block) > - return ret; > - > - if (hdr->pkt_hdr.status != 0) { > - hvnd_error("IOCTL: %s failed; status is %x\n", > - hvnd_get_op_name(ioctl), hdr->pkt_hdr.status); > - return -EINVAL; > - } > - > - switch (hdr->io_status) { > - case STATUS_SUCCESS: > - case STATUS_PENDING: > - return 0; > - > - default: > - hvnd_warn("IOCTL: %s failed io status is %x\n", > - hvnd_get_op_name(ioctl), hdr->io_status); > - return -EINVAL; > - } > -} > - > -void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr, > - u32 data_sz, u32 local, u32 remote, > - u32 ioctl_code, > - u32 ext_data_sz, u32 ext_data_offset, > - u64 irp_handle) > - > -{ > - int pkt_type; > - > - pkt_type = NDV_PKT_ID1_CONTROL; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - hdr->pkt_hdr.packet_type = pkt_type; > - hdr->pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > - hdr->pkt_hdr.data_sz = data_sz; > - > - hdr->pkt_hdr.status = 0; > - > - hdr->file_handle.local = local; > - hdr->file_handle.remote = remote; > - hdr->irp_handle.val64 = irp_handle; > - > - hdr->io_cntrl_code = ioctl_code; > - hdr->output_buf_sz = data_sz - ext_data_sz; > - hdr->input_buf_sz = data_sz - ext_data_sz; > - > - hdr->input_output_buf_offset = 0; > - > - hdr->extended_data.size = ext_data_sz; > - hdr->extended_data.offset = ext_data_offset; > -} > - > - > -int hvnd_create_file(struct hvnd_dev *nd_dev, void *uctx, > - struct ndv_pkt_hdr_create_1 *create, u32 file_flags) > -{ > - int ret; > - int pkt_type; > - > - > - pkt_type = NDV_PKT_ID1_CREATE; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - create->pkt_hdr.packet_type = pkt_type; > - create->pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); > - create->pkt_hdr.data_sz = 0; > - > - create->handle.local = 
get_local_handle(uctx); > - create->access_mask = STANDARD_RIGHTS_ALL; > - create->open_options = OPEN_EXISTING; > - create->file_attributes = FILE_ATTRIBUTE_NORMAL | file_flags; > - create->share_access = FILE_SHARE_ALL; > - > - ret = hvnd_send_packet(nd_dev, create, > - sizeof(struct ndv_pkt_hdr_create_1), > - (unsigned long)create, true); > - return ret; > -} > - > -int hvnd_cleanup_file(struct hvnd_dev *nd_dev, u32 local, u32 remote) > -{ > - int ret; > - int pkt_type; > - struct ndv_pkt_hdr_cleanup_1 cleanup_pkt; > - > - > - pkt_type = NDV_PKT_ID1_CLEANUP; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - > - cleanup_pkt.pkt_hdr.packet_type = pkt_type; > - cleanup_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); > - cleanup_pkt.pkt_hdr.data_sz = 0; > - > - cleanup_pkt.handle.local = local; > - cleanup_pkt.handle.remote = remote; > - > - ret = hvnd_send_packet(nd_dev, &cleanup_pkt, > - sizeof(struct ndv_pkt_hdr_create_1), > - (unsigned long)&cleanup_pkt, true); > - return ret; > -} > - > - > -static int hvnd_do_ioctl(struct hvnd_dev *nd_dev, u32 ioctl, > - struct pkt_nd_provider_ioctl *pkt, > - union ndv_context_handle *hdr_handle, > - struct nd_handle *ioctl_handle, > - u8 *buf, u32 buf_len, bool c_in, bool c_out, u64 irp_val) > -{ > - int ret; > - int pkt_type; > - > - pkt_type = NDV_PKT_ID1_CONTROL; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - > - pkt->hdr.pkt_hdr.packet_type = pkt_type; > - pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > - pkt->hdr.pkt_hdr.data_sz = (sizeof(struct pkt_nd_provider_ioctl) - > - sizeof(struct ndv_packet_hdr_control_1)); > - > - pkt->hdr.file_handle.local = hdr_handle->local; > - pkt->hdr.file_handle.remote = hdr_handle->remote; > - hvnd_debug("create handle local: %x remote: %x\n", > - hdr_handle->local, hdr_handle->remote); > - > - pkt->hdr.irp_handle.val64 = irp_val; > - > - pkt->hdr.io_cntrl_code = ioctl; > - pkt->hdr.output_buf_sz = sizeof(struct nd_ioctl); > - pkt->hdr.input_buf_sz = sizeof(struct nd_ioctl); > - pkt->hdr.input_output_buf_offset = 0; > - memset(&pkt->ioctl.handle, 0, sizeof(struct nd_handle)); > - pkt->ioctl.handle.version = ND_VERSION_1; > - > - switch (ioctl) { > - case IOCTL_ND_PROVIDER_BIND_FILE: > - pkt->ioctl.handle.handle = ioctl_handle->handle; > - break; > - default: > - break; > - }; > - > - /* > - * Copy the input buffer, if needed. 
> - */ > - > - if (c_in && (buf != NULL)) > - memcpy(pkt->ioctl.raw_buffer, buf, buf_len); > - > - ret = hvnd_send_packet(nd_dev, pkt, > - sizeof(struct pkt_nd_provider_ioctl), > - (unsigned long)pkt, true); > - > - if (ret) > - return ret; > - > - if (c_out && (buf != NULL)) > - memcpy(buf, pkt->ioctl.raw_buffer, buf_len); > - > - return ret; > -} > - > -static int idr_callback(int id, void *p, void *data) > -{ > - if (p == data) > - return id; > - return 0; > -} > - > -void remove_uctx(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > -{ > - int pid = current_pid(); > - unsigned long flags; > - int id; > - > - if (get_uctx(nd_dev, pid) == uctx) > - remove_handle(nd_dev, &nd_dev->uctxidr, pid); > - else { > - hvnd_warn("uctx %p not found on pid %d, doing a idr search\n", > - uctx, current_pid()); > - > - spin_lock_irqsave(&nd_dev->id_lock, flags); > - id = idr_for_each(&nd_dev->uctxidr, idr_callback, uctx); > - spin_unlock_irqrestore(&nd_dev->id_lock, flags); > - > - if (id) > - remove_handle(nd_dev, &nd_dev->uctxidr, id); > - else { > - hvnd_error("uctx %p not found in idr table\n", uctx); > - return; > - } > - } > - > - kfree(uctx); > -} > - > -int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > -{ > - int ret; > - > - /* > - * First close the adaptor. > - */ > - > - ret = hvnd_free_handle(nd_dev, uctx, > - uctx->adaptor_hdl, > - IOCTL_ND_ADAPTER_CLOSE); > - > - if (ret) > - hvnd_error("Adaptor close failed; ret is %x\n", ret); > - > - /* > - * Now close the two files we created. > - */ > - > - ret = hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, > - uctx->file_handle_ovl.remote); > - > - if (ret) > - hvnd_error("file cleanup failed; ret is %x\n", ret); > - > - ret = hvnd_cleanup_file(nd_dev, uctx->file_handle.local, > - uctx->file_handle.remote); > - > - if (ret) > - hvnd_error("File cleanup failed; ret is %x\n", ret); > - > - /* > - * Remove the uctx from the ID table. > - */ > - remove_uctx(nd_dev, uctx); > - > - return 0; > -} > - > -int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > -{ > - int ret; > - struct pkt_nd_provider_ioctl *pr_init = &uctx->pr_init_pkt; > - int pkt_type; > - struct nd_handle ioctl_handle; > - struct pkt_nd_open_adapter *pr_o_adap = &uctx->o_adap_pkt; > - > - ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt, 0); > - if (ret) { > - hvnd_error("hvnd_create_file failed ret=%d\n", ret); > - goto error_cr; > - } > - > - if (uctx->create_pkt.pkt_hdr.status != 0) { > - hvnd_error("create File failed; status is %d\n", > - uctx->create_pkt.pkt_hdr.status); > - ret = -EINVAL; > - goto error_cr; > - } > - > - uctx->file_handle.local = uctx->create_pkt.handle.local; > - uctx->file_handle.remote = uctx->create_pkt.handle.remote; > - hvnd_debug("INITIALIZE PROVIDER\n"); > - /* > - * Now Initialize the Provider. > - */ > - ioctl_handle.handle = 0; > - ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_INIT, pr_init, > - &uctx->create_pkt.handle, > - &ioctl_handle, NULL, 0, false, false, 0); > - > - if (ret) { > - ret = -EINVAL; > - goto error_pr_init; > - } > - > - if (pr_init->hdr.pkt_hdr.status != 0) { > - hvnd_error("Provider INIT failed; status is %d\n", > - pr_init->hdr.pkt_hdr.status); > - ret = -EINVAL; > - goto error_pr_init; > - } > - > - if (pr_init->hdr.io_status != 0) { > - hvnd_error("Provider INIT failed; io status is %d\n", > - pr_init->hdr.io_status); > - ret = -EINVAL; > - goto error_pr_init; > - } > - > - /* > - * Now create the overlap file. 
> - */ > - > - hvnd_debug("CREATE OVERLAP FILE\n"); > - ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt_ovl, > - FILE_FLAG_OVERLAPPED); > - if (ret) { > - hvnd_error("hvnd_create_file failed ret=%d\n", ret); > - goto error_pr_init; > - } > - > - if (uctx->create_pkt_ovl.pkt_hdr.status != 0) { > - hvnd_error("create Overlap File failed; status is %d\n", > - uctx->create_pkt_ovl.pkt_hdr.status); > - ret = -EINVAL; > - goto error_pr_init; > - } > - uctx->file_handle_ovl.local = uctx->create_pkt_ovl.handle.local; > - uctx->file_handle_ovl.remote = uctx->create_pkt_ovl.handle.remote; > - > - /* > - * Now bind the two file handles together. > - */ > - > - hvnd_debug("BIND FILE IOCTL remote handle: %d local handle: %d\n", > - uctx->create_pkt_ovl.handle.remote, > - uctx->create_pkt_ovl.handle.local); > - > - ioctl_handle.handle = uctx->create_pkt_ovl.handle.val64; > - ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_BIND_FILE, pr_init, > - &uctx->create_pkt.handle, > - &ioctl_handle, NULL, 0, false, false, 0); > - > - if (ret) { > - ret = -EINVAL; > - goto error_file_bind; > - } > - if (pr_init->hdr.pkt_hdr.status != 0) { > - hvnd_error("Provider File bind failed; status is %d\n", > - pr_init->hdr.pkt_hdr.status); > - ret = -EINVAL; > - goto error_file_bind; > - } > - if (pr_init->hdr.io_status != 0) { > - hvnd_error("Provider INIT failed; io status is %d\n", > - pr_init->hdr.io_status); > - ret = -EINVAL; > - goto error_file_bind; > - } > - > - /* > - * Now open the adaptor. > - */ > - > - hvnd_debug("OPENING THE ADAPTOR\n"); > - > - pkt_type = NDV_PKT_ID1_CONTROL; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - pr_o_adap->hdr.pkt_hdr.packet_type = pkt_type; > - pr_o_adap->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > - pr_o_adap->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_open_adapter) - > - sizeof(struct ndv_packet_hdr_control_1); > - > - pr_o_adap->hdr.pkt_hdr.status = 0; > - > - hvnd_debug("hdr sz is %d\n", pr_o_adap->hdr.pkt_hdr.hdr_sz); > - hvnd_debug("data sz is %d\n", pr_o_adap->hdr.pkt_hdr.data_sz); > - > - pr_o_adap->hdr.file_handle.local = uctx->create_pkt.handle.local; > - pr_o_adap->hdr.file_handle.remote = uctx->create_pkt.handle.remote; > - hvnd_debug("create handle local is %x\n", > - uctx->create_pkt.handle.local); > - hvnd_debug("create handle remote is %x\n", > - uctx->create_pkt.handle.remote); > - pr_o_adap->hdr.irp_handle.val64 = 0; > - > - pr_o_adap->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_OPEN; > - pr_o_adap->hdr.output_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - > - sizeof(struct extended_data_oad); > - pr_o_adap->hdr.input_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - > - sizeof(struct extended_data_oad); > - > - hvnd_debug("output buf sz is %d\n", pr_o_adap->hdr.output_buf_sz); > - hvnd_debug("input buf sz is %d\n", pr_o_adap->hdr.input_buf_sz); > - hvnd_debug("packet size is %d\n", > - (int)sizeof(struct pkt_nd_open_adapter)); > - > - pr_o_adap->hdr.input_output_buf_offset = 0; > - > - > - pr_o_adap->hdr.extended_data.size = sizeof(struct extended_data_oad); > - pr_o_adap->hdr.extended_data.offset = > - offsetof(struct pkt_nd_open_adapter, ext_data) - > - sizeof(struct ndv_packet_hdr_control_1); > - > - hvnd_debug("size of the extended data size: %d\n", > - (int)sizeof(struct extended_data_oad)); > - hvnd_debug("offset of extended data: %d\n", > - pr_o_adap->hdr.extended_data.offset); > - > - /* > - * Now fill out the ioctl section. 
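The whole create/bind dance here rides on paired guest/host ids: the guest picks .local, the host returns .remote, and the pair travels as a single u64 (handle.val64, irp_handle.val64). The layout below is only inferred from usage in this file — the real definition of ndv_context_handle lives in vmbus_rdma.h, which this hunk does not show:

/* Sketch: inferred shape of the local/remote handle pair. */
#include <linux/types.h>

union ctx_handle {
        struct {
                u32 local;              /* chosen by the guest */
                u32 remote;             /* returned by the host */
        };
        u64 val64;                      /* both halves as one word */
};

That pairing is why hvnd_init_hdr() can take a single u64 for the IRP handle while callers that only have a local id fill irp_fhandle.local and pass irp_fhandle.val64.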
> - */ > - > - pr_o_adap->ioctl.input.version = ND_VERSION_1; > - pr_o_adap->ioctl.input.ce_mapping_cnt = > - RTL_NUMBER_OF(pr_o_adap->mappings.ctx_input.mappings); > - > - hvnd_debug("ce_mapping cnt is %d\n", > - pr_o_adap->ioctl.input.ce_mapping_cnt); > - > - pr_o_adap->ioctl.input.cb_mapping_offset = sizeof(union oad_ioctl); > - hvnd_debug("cb_mapping offset is %d\n", > - pr_o_adap->ioctl.input.cb_mapping_offset); > - pr_o_adap->ioctl.input.adapter_id = (u64)nd_dev; > - > - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_type = ND_MAP_IOSPACE; > - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cache_type = ND_NON_CACHED; > - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cb_length = 4096; > - > - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_type = ND_MAP_IOSPACE; > - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cache_type = ND_WRITE_COMBINED; > - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cb_length = 4096; > - > - /* > - * Fill in the extended data. > - */ > - pr_o_adap->ext_data.cnt = IBV_GET_CONTEXT_MAPPING_MAX; > - > - ret = hvnd_send_packet(nd_dev, pr_o_adap, > - sizeof(struct pkt_nd_open_adapter), > - (unsigned long)pr_o_adap, true); > - if (ret) { > - ret = -EINVAL; > - goto error_file_bind; > - } > - > - if (pr_o_adap->hdr.pkt_hdr.status != 0) { > - hvnd_error("Open adaptor failed; status is %d\n", > - pr_o_adap->hdr.pkt_hdr.status); > - ret = -EINVAL; > - goto error_file_bind; > - } > - > - if (pr_o_adap->hdr.io_status != 0) { > - hvnd_error("Open adaptor failed;io status is %d\n", > - pr_o_adap->hdr.io_status); > - ret = -EINVAL; > - goto error_file_bind; > - } > - > - /* > - * Copy the necessary response from the host. > - */ > - > - uctx->adaptor_hdl = pr_o_adap->ioctl.resrc_desc.handle; > - > - > - hvnd_debug("adaptor handle: %p\n", (void *)uctx->adaptor_hdl); > - > - uctx->uar_base = > - pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_UAR].info; > - hvnd_debug("uar base: %p\n", (void *)uctx->uar_base); > - > - uctx->bf_base = > - pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_BF].info; > - hvnd_debug("bf base: %p\n", (void *)uctx->bf_base); > - > - uctx->bf_buf_size = > - pr_o_adap->mappings.ctx_output.bf_buf_size; > - hvnd_debug("bf buf size: %d\n", uctx->bf_buf_size); > - > - uctx->bf_offset = > - pr_o_adap->mappings.ctx_output.bf_offset; > - hvnd_debug("bf offset: %d\n", uctx->bf_offset); > - > - uctx->cqe_size = > - pr_o_adap->mappings.ctx_output.cqe_size; > - hvnd_debug("cqe size: %d\n", uctx->cqe_size); > - > - uctx->max_qp_wr = > - pr_o_adap->mappings.ctx_output.max_qp_wr; > - hvnd_debug("max qp wr: %d\n", uctx->max_qp_wr); > - > - uctx->max_sge = > - pr_o_adap->mappings.ctx_output.max_sge; > - hvnd_debug("max sge: %d\n", uctx->max_sge); > - > - uctx->max_cqe = > - pr_o_adap->mappings.ctx_output.max_cqe; > - hvnd_debug("max cqe: %d\n", uctx->max_cqe); > - > - uctx->num_qps = > - pr_o_adap->mappings.ctx_output.qp_tab_size; > - hvnd_debug("num qps: %d\n", uctx->num_qps); > - > - /* > - * Now query the adaptor and stash away the adaptor info. 
> - */ > - > - ret = hvnd_query_adaptor(nd_dev, uctx); > - if (ret) { > - hvnd_error("Query Adaptor failed; ret is %d\n", ret); > - goto query_err; > - } > - > - return ret; > - > -query_err: > - hvnd_free_handle(nd_dev, uctx, > - uctx->adaptor_hdl, > - IOCTL_ND_ADAPTER_CLOSE); > - > - hvnd_error("Open Adaptor Failed!!\n"); > - > -error_file_bind: > - hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, > - uctx->file_handle_ovl.remote); > - > -error_pr_init: > - hvnd_cleanup_file(nd_dev, uctx->file_handle.local, > - uctx->file_handle.remote); > - > -error_cr: > - if (get_uctx(nd_dev, current_pid()) != NULL) > - remove_handle(nd_dev, &nd_dev->uctxidr, current_pid()); > - > - return ret; > -} > - > -int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_cq *cq) > -{ > - int ret; > - struct pkt_nd_create_cq *pkt; > - int num_pfn, num_db_pfn, num_sn_pfn; > - int cq_pkt_size; > - unsigned int cq_buf_size, offset; > - u32 ext_data_sz; > - u32 ext_data_offset; > - > - /* > - * Now create CQ. > - * First compute the number of PFNs we need to accomodate: > - * One each for door bell and arm_sn and pages in cq buffer. > - */ > - cq_buf_size = (cq->entries * uctx->cqe_size); > - offset = offset_in_page(cq->cq_buf); > - num_pfn = DIV_ROUND_UP(offset + cq_buf_size, PAGE_SIZE); > - > - offset = offset_in_page(cq->db_addr); > - num_db_pfn = DIV_ROUND_UP(offset + 8, PAGE_SIZE); > - > - offset = offset_in_page(&cq->arm_sn); > - num_sn_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); > - > - cq_pkt_size = sizeof(struct pkt_nd_create_cq) + > - (num_pfn * sizeof(u64)); > - > - ext_data_sz = sizeof(struct create_cq_ext_data) + > - (num_pfn * sizeof(u64)); > - ext_data_offset = offsetof(struct pkt_nd_create_cq, ext_data) - > - sizeof(struct ndv_packet_hdr_control_1); > - > - hvnd_debug("CREATE CQ, num user addr pfns is %d\n", num_pfn); > - hvnd_debug("CREATE CQ, num db pfns is %d\n", num_db_pfn); > - > - pkt = kzalloc(cq_pkt_size, GFP_KERNEL); > - > - if (!pkt) > - return -ENOMEM; > - > - hvnd_init_hdr(&pkt->hdr, > - (cq_pkt_size - > - sizeof(struct ndv_packet_hdr_control_1)), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CQ_CREATE, > - ext_data_sz, > - ext_data_offset, > - 0); > - > - /* > - * Now fill out the ioctl section. 
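The PFN bookkeeping in hvnd_create_cq() above deserves a second look, because the byte offset into the first page has to be added before rounding up: a buffer of cq->entries * cqe_size bytes that starts mid-page straddles one more page than its length alone suggests. The computation in isolation:

/* Sketch: pages touched by a buffer that may start mid-page. */
#include <linux/kernel.h>               /* DIV_ROUND_UP */
#include <linux/mm.h>                   /* offset_in_page, PAGE_SIZE */

static u32 pfns_needed(void *addr, u32 length)
{
        u32 offset = offset_in_page(addr);

        return DIV_ROUND_UP(offset + length, PAGE_SIZE);
}

With 4 KiB pages, a 16384-byte CQ buffer starting 512 bytes into a page needs DIV_ROUND_UP(16896, 4096) = 5 PFNs, not 4. The same computation is repeated above for the 8-byte doorbell and the 4-byte arm_sn word, each of which can in principle still straddle a page boundary.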
> - */ > - > - pkt->ioctl.input.version = ND_VERSION_1; > - pkt->ioctl.input.queue_depth = cq->entries; > - pkt->ioctl.input.ce_mapping_cnt = MLX4_IB_CREATE_CQ_MAPPING_MAX; > - pkt->ioctl.input.cb_mapping_offset = sizeof(union create_cq_ioctl); > - > - hvnd_debug("ce_mapping cnt is %d\n", pkt->ioctl.input.ce_mapping_cnt); > - hvnd_debug("cb_mapping offset is %d\n", > - pkt->ioctl.input.cb_mapping_offset); > - > - pkt->ioctl.input.adapter_handle = uctx->adaptor_hdl; > - pkt->ioctl.input.affinity.mask = 0; > - pkt->ioctl.input.affinity.group = -1; > - > - // 0 for usermode CQ arming > - pkt->mappings.cq_in.flags = 0; > - > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.map_type = ND_MAP_MEMORY; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.access_type = ND_MODIFY_ACCESS; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.address = (u64)cq->cq_buf; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.cb_length = (cq->entries * uctx->cqe_size); > - > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.access_type = ND_WRITE_ACCESS; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.address = (u64)cq->db_addr; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.cb_length = 8; //size of two ints > - > - > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.map_type = ND_MAP_MEMORY; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.access_type = ND_MODIFY_ACCESS; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.address = (u64)&cq->arm_sn; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.cb_length = 4; //size of one int > - /* > - * Fill in the extended data. > - */ > - > - pkt->ext_data.cnt = 3; > - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].size = (sizeof(struct gpa_range) + (num_pfn * sizeof(u64))); > - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].offset = offsetof(struct create_cq_ext_data, cqbuf_gpa); > - > - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].size = sizeof(struct cq_db_gpa); > - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].offset = offsetof(struct create_cq_ext_data, db_gpa); > - > - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].size = sizeof(struct cq_db_gpa); > - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].offset = offsetof(struct create_cq_ext_data, sn_gpa); > - > - /* > - * Fill up the gpa range for cq buffer. > - */ > - > - pkt->ext_data.db_gpa.byte_count = 8; > - pkt->ext_data.db_gpa.byte_offset = offset_in_page(cq->db_addr); > - user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], cq->db_umem); > - > - pkt->ext_data.sn_gpa.byte_count = 4; > - pkt->ext_data.sn_gpa.byte_offset = offset_in_page(&cq->arm_sn); > - init_pfn(&pkt->ext_data.sn_gpa.pfn_array[0], > - &cq->arm_sn, > - 4); > - > - pkt->ext_data.cqbuf_gpa.byte_count = (cq->entries * uctx->cqe_size); > - pkt->ext_data.cqbuf_gpa.byte_offset = offset_in_page(cq->cq_buf); > - user_va_init_pfn(&pkt->ext_data.cqbuf_gpa.pfn_array[0], cq->umem); > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, cq_pkt_size, (u64)pkt); > - > - if (ret) > - goto cr_cq_err; > - > - /* > - * Copy the necessary response from the host. 
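The ext_data section being filled above is a table of (size, offset) pairs pointing at variable-sized GPA descriptors inside the same message; each descriptor is a byte count, a starting offset within the first page, and a flexible PFN array. This matches the generic struct gpa_range the code sizes against (the cq_db_gpa variant is driver-local and assumed analogous):

/* Sketch: a variable-sized guest-physical-address descriptor. */
struct gpa_range_sketch {
        u32 byte_count;                 /* length of the region in bytes */
        u32 byte_offset;                /* data offset in the first page */
        u64 pfn_array[];                /* one PFN per page touched */
};

This is also why cq_pkt_size at the top of hvnd_create_cq() is sizeof(struct pkt_nd_create_cq) plus num_pfn * sizeof(u64): the flexible tail has to land inside the allocated message.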
> - */ > - cq->cqn = pkt->mappings.cq_resp.cqn; > - cq->cqe = pkt->mappings.cq_resp.cqe; > - cq->cq_handle = pkt->ioctl.resrc_desc.handle; > - > - ret = insert_handle(nd_dev, &nd_dev->cqidr, cq, cq->cqn); > - > - if (ret) > - goto cr_cq_err; > - hvnd_debug("CQ create after success cqn is %d\n", cq->cqn); > - hvnd_debug("CQ create after success cqe is %d\n", cq->cqe); > - hvnd_debug("CQ create after success cq handle is %p\n", > - (void *)cq->cq_handle); > - > -cr_cq_err: > - kfree(pkt); > - return ret; > -} > - > -int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq) > -{ > - struct pkt_nd_free_cq free_cq_pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&free_cq_pkt, 0, sizeof(free_cq_pkt)); > - hvnd_init_hdr(&free_cq_pkt.hdr, > - sizeof(struct pkt_nd_free_cq) - > - sizeof(struct ndv_packet_hdr_control_1), > - cq->uctx->create_pkt.handle.local, > - cq->uctx->create_pkt.handle.remote, > - IOCTL_ND_CQ_FREE, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - free_cq_pkt.ioctl.in.version = ND_VERSION_1; > - free_cq_pkt.ioctl.in.handle = cq->cq_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &free_cq_pkt.hdr, > - sizeof(struct pkt_nd_free_cq), > - (u64)&free_cq_pkt); > - > - if (ret) > - goto free_cq_err; > - > - remove_handle(nd_dev, &nd_dev->cqidr, cq->cqn); > - > - return 0; > - > -free_cq_err: > - return ret; > -} > - > - > -int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq, > - u32 notify_type, u64 irp_handle) > -{ > - struct pkt_nd_notify_cq notify_cq_pkt; > - int ret; > - union ndv_context_handle irp_fhandle; > - > - irp_fhandle.local = cq->ep_object.local_irp; > - > - // KYS try to avoid having to zero everything > - memset(¬ify_cq_pkt, 0, sizeof(notify_cq_pkt)); > - hvnd_init_hdr(¬ify_cq_pkt.hdr, > - sizeof(struct pkt_nd_notify_cq) - > - sizeof(struct ndv_packet_hdr_control_1), > - cq->uctx->create_pkt.handle.local, > - cq->uctx->create_pkt.handle.remote, > - IOCTL_ND_CQ_NOTIFY, 0, 0, irp_fhandle.val64); > - > - /* > - * Now fill in the ioctl section. > - */ > - notify_cq_pkt.ioctl.in.version = ND_VERSION_1; > - notify_cq_pkt.ioctl.in.cq_handle = cq->cq_handle; > - notify_cq_pkt.ioctl.in.type = notify_type; > - > - > - ret = hvnd_send_ioctl_pkt(nd_dev, ¬ify_cq_pkt.hdr, > - sizeof(struct pkt_nd_notify_cq), > - (u64)¬ify_cq_pkt); > - > - return ret; > -} > - > -/* > - * Memory region operations. > - */ > -int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 pd_handle, u64 *mr_handle) > -{ > - struct pkt_nd_create_mr pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_MR_CREATE, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.handle = pd_handle; > - hvnd_debug("PD handle is %p\n", (void *)pd_handle); > - pkt.ioctl.in.reserved = 0; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - /* > - * Copy the handle. 
> - */ > - hvnd_debug("mr handle is %p\n", (void *)pkt.ioctl.out); > - *mr_handle = pkt.ioctl.out; > - > - return 0; > - > -err: > - hvnd_error("create mr failed: %d\n", ret); > - return ret; > - > -} > - > -int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle) > -{ > - return hvnd_free_handle(nd_dev, uctx, handle, IOCTL_ND_MR_FREE); > -} > - > -int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle) > -{ > - struct pkt_nd_deregister_mr pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_MR_DEREGISTER, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.handle = handle; > - pkt.ioctl.in.reserved = 0; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - hvnd_error("de-register mr failed: %d\n", ret); > - return ret; > - > -} > - > -static inline u32 hvnd_convert_access(int acc) > -{ > - return (acc & IB_ACCESS_REMOTE_WRITE ? ND_MR_FLAG_ALLOW_REMOTE_WRITE : 0) | > - (acc & IB_ACCESS_REMOTE_READ ? ND_MR_FLAG_ALLOW_REMOTE_READ : 0) | > - (acc & IB_ACCESS_LOCAL_WRITE ? ND_MR_FLAG_ALLOW_LOCAL_WRITE : 0); > -} > - > - > -int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_mr *mr) > -{ > - struct pkt_nd_register_mr pkt; > - int ret; > - struct hv_mpb_array *pb; > - struct vmbus_packet_mpb_array *tpb; > - int sz_leaf; > - int num_pgs; > - int i = 0; > - int ext_data_sz; > - u32 acc_flags; > - u32 desc_size; > - int pkt_type; > - > - /* > - * The user address is passed in via a two level structure. > - * An Array of struct hv_page_buffer will be used to describe > - * the user memory. The pages containing this array will be descibed > - * in another array of struct hv_page_buffer. We pass this seconed level > - * array to the host. > - */ > - > - hvnd_debug("ib_umem_page_count(mr->umem)=%d\n", > - ib_umem_page_count(mr->umem)); > - > - sz_leaf = ib_umem_page_count(mr->umem) * sizeof(u64) + > - sizeof(struct hv_mpb_array); > - > - pb = (struct hv_mpb_array *) __get_free_pages(GFP_KERNEL|__GFP_ZERO, > - get_order(sz_leaf)); > - > - if (pb == NULL) > - return -ENOMEM; > - /* > - * Allocate an array of hv_page_buffer to describe the first level. > - */ > - num_pgs = DIV_ROUND_UP(sz_leaf, PAGE_SIZE); > - hvnd_debug("num pages in the top array is %d\n", num_pgs); > - > - desc_size = (num_pgs * sizeof(u64) + > - sizeof(struct vmbus_packet_mpb_array)); > - tpb = (struct vmbus_packet_mpb_array *) > - __get_free_pages(GFP_KERNEL|__GFP_ZERO, get_order(desc_size)); > - > - if (tpb == NULL) { > - free_pages((unsigned long)pb, get_order(sz_leaf)); > - return -ENOMEM; > - } > - > - hvnd_debug("sz leaf: %d; pgs in top %d\n", sz_leaf, num_pgs); > - > - /* > - * Now fill the leaf level array. > - */ > - pb->len = mr->length; > - pb->offset = offset_in_page(mr->start); > - user_va_init_pfn(pb->pfn_array, mr->umem); > - > - /* > - * Now fill out the top level array. 
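The two-level description built in hvnd_mr_register() above exists because a large registration's PFN list cannot travel inline in a ring-buffer packet: the leaf hv_mpb_array names the user buffer page by page, and the pages of the leaf itself are then named by the vmbus_packet_mpb_array handed to vmbus_sendpacket_mpb_desc(). The size arithmetic, isolated (npages stands for ib_umem_page_count(mr->umem)):

/* Sketch: sizing the two levels of the MR page description. */
static size_t mr_desc_size(unsigned int npages, size_t *leaf_bytes)
{
        size_t top_pages;

        *leaf_bytes = sizeof(struct hv_mpb_array) +
                      npages * sizeof(u64);
        top_pages = DIV_ROUND_UP(*leaf_bytes, PAGE_SIZE);

        /* bytes of the descriptor that travels with the packet */
        return sizeof(struct vmbus_packet_mpb_array) +
               top_pages * sizeof(u64);
}

Both levels are allocated with __get_free_pages(..., get_order(size)) above precisely so the leaf is page-aligned and its pages can be named with plain virt_to_phys() arithmetic.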
> - */ > - for (i = 0; i < num_pgs; i++) { > - tpb->range.pfn_array[i] = virt_to_phys( > - (u8 *)pb + (PAGE_SIZE * i)) >> PAGE_SHIFT; > - hvnd_debug("virtual address = %p\n", > - (u8 *)pb + (PAGE_SIZE * i)); > - hvnd_debug("physical address = %llx\n", > - virt_to_phys((u8 *)pb + (PAGE_SIZE * i))); > - hvnd_debug("tpb->range.pfn_array[%d]=%llx\n", i, > - tpb->range.pfn_array[i]); > - } > - > - tpb->range.offset = 8; > - tpb->range.len = ib_umem_page_count(mr->umem) * sizeof(u64); > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - ext_data_sz = (ib_umem_page_count(mr->umem) * sizeof(u64)); > - acc_flags = ND_MR_FLAG_DO_NOT_SECURE_VM | hvnd_convert_access(mr->acc); > - hvnd_debug("memory register access flags are: %x\n", acc_flags); > - > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_MR_REGISTER, 0, 0, 0); > - > - /* > - * The memory registration call uses a different mechanism to pass > - * pfn information. > - */ > - > - pkt_type = pkt.hdr.pkt_hdr.packet_type; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTION_EXTERNAL_DATA); > - pkt.hdr.pkt_hdr.packet_type = pkt_type; > - > - pkt.hdr.extended_data.size = ext_data_sz; > - pkt.hdr.extended_data.offset = 0; > - /* > - * Now fill out the ioctl. > - */ > - > - pkt.ioctl.in.header.version = ND_VERSION_1; > - pkt.ioctl.in.header.flags = acc_flags; > - pkt.ioctl.in.header.cb_length = mr->length; > - pkt.ioctl.in.header.target_addr = mr->virt; > - pkt.ioctl.in.header.mr_handle = mr->mr_handle; > - pkt.ioctl.in.address = mr->virt; > - > - /* > - * Now send the packet to the host. > - */ > - > - ret = hvnd_send_pgbuf_ioctl_pkt(nd_dev, > - tpb, desc_size, > - &pkt.hdr, > - sizeof(pkt), > - (unsigned long)&pkt); > - > - if (ret) > - goto err; > - > - hvnd_info("MR REGISTRATION SUCCESS\n"); > - /* > - * Copy the mr registration data. > - */ > - hvnd_debug("mr registration lkey %x\n", pkt.ioctl.out.lkey); > - hvnd_debug("mr registration rkey %x\n", pkt.ioctl.out.rkey); > - > - mr->mr_lkey = pkt.ioctl.out.lkey; > - mr->mr_rkey = pkt.ioctl.out.rkey; > - > - mr->ibmr.lkey = mr->mr_lkey; > - mr->ibmr.rkey = be32_to_cpu(mr->mr_rkey); > - hvnd_debug("ibmr registration lkey %x\n", mr->ibmr.lkey); > - hvnd_debug("ibmr registration rkey %x\n", mr->ibmr.rkey); > - > - free_pages((unsigned long)pb, get_order(sz_leaf)); > - free_pages((unsigned long)tpb, get_order(desc_size)); > - > - return 0; > - > -err: > - free_pages((unsigned long)pb, get_order(sz_leaf)); > - free_pages((unsigned long)tpb, get_order(desc_size)); > - > - hvnd_error("mr register failed: %d\n", ret); > - return ret; > -} > - > -/* > - * Listener operations. > - */ > -int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 *listener_handle) > -{ > - struct pkt_nd_cr_listener pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_LISTENER_CREATE, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. 
> - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; > - hvnd_debug("Adaptor handle is %p\n", (void *)uctx->adaptor_hdl); > - pkt.ioctl.in.hdr.reserved = 0; > - pkt.ioctl.in.to_semantics = false; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - /* > - * Copy the listener handle. > - */ > - hvnd_debug("listener handle is %p\n", (void *)pkt.ioctl.out); > - *listener_handle = pkt.ioctl.out; > - > - return 0; > - > -err: > - hvnd_error("create listener failed: ret=%d uctx=%p adaptor handle=%llu\n", > - ret, uctx, uctx->adaptor_hdl); > - return ret; > - > -} > - > -int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle) > -{ > - struct pkt_nd_free_listener pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_LISTENER_FREE, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.handle = listener_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - hvnd_error("free listener failed: %d\n", ret); > - return ret; > -} > - > -int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, union nd_sockaddr_inet *addr) > -{ > - struct pkt_nd_bind_listener pkt; > - kuid_t uid = current_uid(); > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_LISTENER_BIND, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.handle = listener_handle; > - pkt.ioctl.in.hdr.reserved = 0; > - > - pkt.ioctl.in.authentication_id = (u32)uid.val; > - pkt.ioctl.in.is_admin = false; > - > - memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - hvnd_error("bind listener failed: %d\n", ret); > - return ret; > -} > - > -int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, u32 backlog) > -{ > - struct pkt_nd_listen_listener pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_LISTENER_LISTEN, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. 
> - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.listener_handle = listener_handle; > - pkt.ioctl.in.back_log = backlog; > - > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - hvnd_error("listen listener failed: %d\n", ret); > - return ret; > -} > - > -int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, union nd_sockaddr_inet *addr) > -{ > - struct pkt_nd_get_addr_listener pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_LISTENER_GET_ADDRESS, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.handle = listener_handle; > - > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - /* > - * Copy the adddress. > - */ > - > - memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); > - > - return 0; > - > -err: > - hvnd_error("listen listener failed: %d\n", ret); > - return ret; > -} > - > -int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 listener_handle, u64 connector_handle, > - u64 irp_handle) > -{ > - struct pkt_nd_get_connection_listener pkt; > - int ret; > - union ndv_context_handle irp_fhandle; > - > - ret = get_irp_handle(nd_dev, &irp_fhandle.local, (void *)irp_handle); > - > - if (ret) { > - hvnd_error("get_irp_handle() failed: err: %d\n", ret); > - return ret; > - } > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST, 0, 0, > - irp_fhandle.val64); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.listener_handle = listener_handle; > - pkt.ioctl.in.connector_handle = connector_handle; > - > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - hvnd_error("get connection listener failed: %d\n", ret); > - return ret; > -} > - > -/* > - * Connector APIs. > - */ > - > -int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 *connector_handle) > -{ > - struct pkt_nd_cr_connector pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(struct pkt_nd_cr_listener) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_CREATE, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; > - pkt.ioctl.in.to_semantics = false; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - /* > - * Copy the listener handle. 
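[Two nits in hvnd_cr_connector() above: the payload size passed to
hvnd_init_hdr() is computed from sizeof(struct pkt_nd_cr_listener) rather
than the connector packet, which looks like a copy/paste leftover and is
harmless only if the two packets happen to be the same size, and the "Copy
the listener handle" comment is carried over from the listener path. The
presumably intended size expression:

    sizeof(struct pkt_nd_cr_connector) -
            sizeof(struct ndv_packet_hdr_control_1)
]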
> - */ > - hvnd_debug("connector handle is %p\n", (void *)pkt.ioctl.out); > - *connector_handle = pkt.ioctl.out; > - > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle) > -{ > - struct pkt_nd_free_connector pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_FREE, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.handle = handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle, union nd_sockaddr_inet *addr) > -{ > - struct pkt_nd_bind_connector pkt; > - int ret; > - kuid_t uid = current_uid(); > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_BIND, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.handle = handle; > - > - memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); > - > - pkt.ioctl.in.authentication_id = (u32)uid.val; > - pkt.ioctl.in.is_admin = false; > - > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit, > - u32 priv_data_length, const u8 *priv_data, > - u64 qp_handle, struct if_physical_addr *phys_addr, > - union nd_sockaddr_inet *dest_addr, > - struct hvnd_ep_obj *ep) > -{ > - struct pkt_nd_connector_connect *pkt = &ep->connector_connect_pkt; > - int ret; > - union ndv_context_handle irp_fhandle; > - > - hvnd_debug("local irp is %d\n", ep->local_irp); > - irp_fhandle.local = ep->local_irp; > - > - if (priv_data_length > MAX_PRIVATE_DATA_LEN) { > - hvnd_error("priv_data_length=%d\n", priv_data_length); > - return -EINVAL; > - } > - > - // KYS try to avoid having to zero everything > - memset(pkt, 0, sizeof(*pkt)); > - hvnd_init_hdr(&pkt->hdr, > - sizeof(*pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_CONNECT, 0, 0, irp_fhandle.val64); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt->ioctl.in.hdr.version = ND_VERSION_1; > - pkt->ioctl.in.hdr.connector_handle = connector_handle; > - pkt->ioctl.in.hdr.read_limits.inbound = in_rd_limit; > - pkt->ioctl.in.hdr.read_limits.outbound = out_rd_limit; > - pkt->ioctl.in.hdr.cb_private_data_length = priv_data_length; > - pkt->ioctl.in.hdr.cb_private_data_offset = > - offsetof(union connector_connect_ioctl, in.priv_data); > - pkt->ioctl.in.hdr.qp_handle = qp_handle; > - > - memcpy(&pkt->ioctl.in.hdr.phys_addr, phys_addr, > - sizeof(struct if_physical_addr)); > - > - /* > - * Luke's code does not copy the ip address. 
> - */ > - memcpy(&pkt->ioctl.in.hdr.destination_address, dest_addr, > - sizeof(union nd_sockaddr_inet)); > - > - pkt->ioctl.in.retry_cnt = 7; > - pkt->ioctl.in.rnr_retry_cnt = 7; > - memcpy(pkt->ioctl.in.priv_data, priv_data, priv_data_length); > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, sizeof(*pkt), (u64)pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - enum ibv_qp_state *qp_state) > -{ > - struct pkt_nd_connector_connect_complete pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_COMPLETE_CONNECT, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.handle = connector_handle; > - pkt.ioctl.in.rnr_nak_to = 0; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - *qp_state = pkt.ioctl.out.state; > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, > - u64 qp_handle, > - u32 in_rd_limit, u32 out_rd_limit, > - u32 priv_data_length, const u8 *priv_data, > - enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep) > -{ > - struct pkt_nd_connector_accept pkt; > - int ret; > - union ndv_context_handle irp_fhandle; > - > - irp_fhandle.local = ep->local_irp; > - > - if (priv_data_length > MAX_PRIVATE_DATA_LEN) { > - hvnd_error("priv_data_length=%d\n", priv_data_length); > - return -EINVAL; > - } > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_ACCEPT, 0, 0, irp_fhandle.val64); > - > - /* > - * Now fill in the ioctl section. 
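[The connect/accept paths are asynchronous: the endpoint's IRP slot rides in
the request header (irp_fhandle.val64 is the last argument of
hvnd_init_hdr()), and the host's completion is routed back through that slot
in hvnd_callback(). Schematically, using only names from this patch:

    /* request side */
    irp_fhandle.local = ep->local_irp;

    /* completion side, in hvnd_callback() */
    local_irp = ctrl_hdr->irp_handle.local;
    ep_object = (struct hvnd_ep_obj *)map_irp_to_ctx(nd_dev, local_irp);
]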
> - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.reserved = 0; > - pkt.ioctl.in.hdr.read_limits.inbound = in_rd_limit; > - pkt.ioctl.in.hdr.read_limits.outbound = out_rd_limit; > - pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; > - > - pkt.ioctl.in.hdr.cb_private_data_offset = > - offsetof(struct connector_accept_in, private_data); > - > - pkt.ioctl.in.hdr.connector_handle = connector_handle; > - pkt.ioctl.in.hdr.qp_handle = qp_handle; > - > - pkt.ioctl.in.rnr_nak_to = 0; > - pkt.ioctl.in.rnr_retry_cnt = 7; > - > - > - memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - *qp_state = pkt.ioctl.out.state; > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, > - u32 priv_data_length, u8 *priv_data, > - enum ibv_qp_state *qp_state) > -{ > - struct pkt_nd_connector_reject pkt; > - int ret; > - > - if (priv_data_length > MAX_PRIVATE_DATA_LEN) { > - hvnd_error("priv_data_length=%d\n", priv_data_length); > - return -EINVAL; > - } > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_REJECT, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.reserved = 0; > - pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; > - > - pkt.ioctl.in.hdr.cb_private_data_offset = > - offsetof(struct connector_reject_in, private_data); > - > - pkt.ioctl.in.hdr.connector_handle = connector_handle; > - > - memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - *qp_state = pkt.ioctl.out.state; > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - struct nd_read_limits *rd_limits) > -{ > - struct pkt_nd_connector_get_rd_limits pkt; > - int ret; > - > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_GET_READ_LIMITS, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. 
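[In the accept/reject packets above, cb_private_data_offset is, as the
offsetof() implies, measured from the start of the ioctl input block, so the
host can locate the caller's private data without a second copy:

    pkt.ioctl.in.hdr.cb_private_data_offset =
            offsetof(struct connector_accept_in, private_data);
]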
> - */ > - pkt.ioctl.in.in.version = ND_VERSION_1; > - pkt.ioctl.in.in.reserved = 0; > - pkt.ioctl.in.in.handle = connector_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - *rd_limits = pkt.ioctl.out.out; > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - u8 *priv_data) > -{ > - struct pkt_nd_connector_get_priv_data pkt; > - int ret; > - > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = connector_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - memcpy(priv_data, pkt.ioctl.out, MAX_PRIVATE_DATA_LEN); > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - union nd_sockaddr_inet *peer_addr) > -{ > - struct pkt_nd_connector_get_peer_addr pkt; > - int ret; > - > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = connector_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - memcpy(peer_addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - union nd_sockaddr_inet *addr) > -{ > - struct pkt_nd_connector_get_addr pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_GET_ADDRESS, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. 
> - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = connector_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); > - return 0; > - > -err: > - return ret; > -} > - > - > -int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, struct hvnd_ep_obj *ep) > -{ > - struct pkt_nd_connector_notify_disconnect pkt; > - int ret; > - union ndv_context_handle irp_fhandle; > - > - irp_fhandle .local = ep->local_irp; > - > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT, 0, 0, > - irp_fhandle.val64); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = connector_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - return ret; > -} > - > - > -//ASYNCH call > -int hvnd_connector_disconnect(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, struct hvnd_ep_obj *ep) > -{ > - struct pkt_nd_connector_disconnect pkt; > - int ret; > - union ndv_context_handle irp_fhandle; > - > - irp_fhandle.local = ep->local_irp; > - > - > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_DISCONNECT, 0, 0, irp_fhandle.val64); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = connector_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - return ret; > -} > - > -/* > - * QP operations. > - */ > -int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_qp *qp) > -{ > - int ret; > - struct pkt_nd_create_qp *pkt; > - int num_pfn, num_db_pfn; > - int qp_pkt_size; > - unsigned int offset; > - u32 ext_data_offset; > - u32 ext_data_size; > - > - /* > - * Now create QP. > - * First compute the number of PFNs we need to accomodate: > - * One each for door bell and arm_sn and pages in cq buffer. 
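[The "cq buffer" wording in the comment above looks like a leftover from the
CQ-create path; the pages counted here belong to the QP buffer. With 4K
pages, the count below is just DIV_ROUND_UP over a possibly misaligned
buffer; e.g. a 0x2100-byte queue starting 0x80 into its first page needs
DIV_ROUND_UP(0x80 + 0x2100, 0x1000) = 3 PFNs:

    offset  = offset_in_page(qp->qp_buf);
    num_pfn = DIV_ROUND_UP(offset + qp->buf_size, PAGE_SIZE);
]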
> - */ > - offset = offset_in_page(qp->qp_buf); > - num_pfn = DIV_ROUND_UP(offset + qp->buf_size, PAGE_SIZE); > - > - offset = offset_in_page(qp->db_addr); > - num_db_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); > - > - qp_pkt_size = sizeof(struct pkt_nd_create_qp) + > - (num_pfn * sizeof(u64)); > - > - hvnd_debug("CREATE QP, num pfns is %d\n", num_pfn); > - hvnd_debug("CREATE QP, num DB pfns is %d\n", num_db_pfn); > - > - pkt = kzalloc(qp_pkt_size, GFP_KERNEL); > - > - if (!pkt) > - return -ENOMEM; > - > - hvnd_debug("offset of nd_create_qp is %d\n", > - (int)offsetof(struct pkt_nd_create_qp, ioctl.input)); > - > - ext_data_offset = offsetof(struct pkt_nd_create_qp, ext_data) - > - sizeof(struct ndv_packet_hdr_control_1); > - > - ext_data_size = sizeof(struct create_qp_ext_data) + > - (num_pfn * sizeof(u64)); > - > - hvnd_init_hdr(&pkt->hdr, > - qp_pkt_size - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_QP_CREATE, > - ext_data_size, > - ext_data_offset, > - 0); > - > - /* > - * Now fill out the ioctl section. > - */ > - > - pkt->ioctl.input.hdr.version = ND_VERSION_1; > - > - if (qp->max_inline_data > nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold) > - qp->max_inline_data = nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold; > - pkt->ioctl.input.hdr.cb_max_inline_data = qp->max_inline_data; > - > - hvnd_debug("pkt->ioctl.input.hdr.cb_max_inline_data=%d\n", > - pkt->ioctl.input.hdr.cb_max_inline_data); > - > - pkt->ioctl.input.hdr.ce_mapping_cnt = MLX4_IB_CREATE_QP_MAPPINGS_MAX; > - pkt->ioctl.input.hdr.cb_mapping_offset = sizeof(union create_qp_ioctl); > - > - pkt->ioctl.input.hdr.initiator_queue_depth = qp->initiator_q_depth; > - pkt->ioctl.input.hdr.max_initiator_request_sge = > - qp->initiator_request_sge; > - > - hvnd_debug("recv cq handle is %p\n", (void *)qp->receive_cq_handle); > - hvnd_debug("send cq handle is %p\n", (void *)qp->initiator_cq_handle); > - hvnd_debug("pd handle is %p\n", (void *)qp->pd_handle); > - pkt->ioctl.input.hdr.receive_cq_handle = qp->receive_cq_handle; > - pkt->ioctl.input.hdr.initiator_cq_handle = qp->initiator_cq_handle; > - pkt->ioctl.input.hdr.pd_handle = qp->pd_handle; > - > - > - hvnd_debug("ce_mapping cnt is %d\n", > - pkt->ioctl.input.hdr.ce_mapping_cnt); > - hvnd_debug("cb_mapping offset is %d\n", > - pkt->ioctl.input.hdr.cb_mapping_offset); > - > - pkt->ioctl.input.receive_queue_depth = qp->receive_q_depth; > - pkt->ioctl.input.max_receive_request_sge = qp->receive_request_sge; > - > - > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.map_type = ND_MAP_MEMORY; > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.access_type = ND_MODIFY_ACCESS; > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.address = (u64)qp->qp_buf; > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.cb_length = qp->buf_size; > - > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.access_type = ND_WRITE_ACCESS; > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.address = (u64)qp->db_addr; > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.cb_length = 4; > - > - pkt->mappings.qp_in.log_sq_bb_count = qp->log_sq_bb_count; > - pkt->mappings.qp_in.log_sq_stride = qp->log_sq_stride; > - pkt->mappings.qp_in.sq_no_prefetch = qp->sq_no_prefetch; > - > - > - /* > - * Fill in the 
extended data. > - */ > - > - pkt->ext_data.cnt = 2; > - pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].size = > - sizeof(struct gpa_range) + (num_pfn * sizeof(u64)); > - pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].offset = > - offsetof(struct create_qp_ext_data, qpbuf_gpa); > - > - pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].size = > - sizeof(struct qp_db_gpa); > - pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].offset = > - offsetof(struct create_qp_ext_data, db_gpa); > - > - /* > - * Fill up the gpa range for qp buffer. > - */ > - > - pkt->ext_data.db_gpa.byte_count = 4; // KYS 8 or 16? > - pkt->ext_data.db_gpa.byte_offset = offset_in_page(qp->db_addr); > - user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], qp->db_umem); > - > - pkt->ext_data.qpbuf_gpa.byte_count = qp->buf_size; > - pkt->ext_data.qpbuf_gpa.byte_offset = offset_in_page(qp->qp_buf); > - user_va_init_pfn(&pkt->ext_data.qpbuf_gpa.pfn_array[0], qp->umem); > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, qp_pkt_size, (u64)pkt); > - > - if (ret) > - goto cr_qp_err; > - > - /* > - * Copy the necessary response from the host. > - */ > - qp->qp_handle = pkt->ioctl.resrc_desc.handle; > - > - qp->qpn = pkt->mappings.qp_resp.qpn; > - qp->max_send_wr = pkt->mappings.qp_resp.max_send_wr; > - qp->max_recv_wr = pkt->mappings.qp_resp.max_recv_wr; > - qp->max_send_sge = pkt->mappings.qp_resp.max_send_sge; > - qp->max_recv_sge = pkt->mappings.qp_resp.max_recv_sge; > - > - > - hvnd_debug("qp->max_send_wr=%d max_recv_wr=%d max_send_sge=%d max_recv_sge=%d max_inline_data=%d\n", > - qp->max_send_wr, qp->max_recv_wr, qp->max_send_sge, > - qp->max_recv_sge, qp->max_inline_data); > - > - ret = insert_handle(nd_dev, &nd_dev->qpidr, qp, qp->qpn); > - > - if (ret) > - goto cr_qp_err; > - > - hvnd_debug("QP create after success qpn:%d qp:%p handle:%llu\n", > - qp->qpn, qp, qp->qp_handle); > - > -cr_qp_err: > - kfree(pkt); > - return ret; > -} > - > -int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_qp *qp) > -{ > - int ret; > - > - ret = hvnd_free_handle(nd_dev, uctx, qp->qp_handle, IOCTL_ND_QP_FREE); > - if (ret == 0) > - remove_handle(nd_dev, &nd_dev->qpidr, qp->qpn); > - return ret; > -} > - > -int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_qp *qp) > -{ > - struct pkt_nd_flush_qp pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_QP_FLUSH, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = qp->qp_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - return ret; > -} > - > - > -int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, > - char *mac_addr) > -{ > - int ret; > - int pkt_type = NDV_PKT_ID1_BIND; > - > - /* > - * Send the bind information over to the host. > - * For now, we will have a single ip and MAC address that we > - * will deal with. Down the road we will need to expand support > - * for multiple IP and MAC addresses and also deal with changing > - * IP addresses. 
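[One portability note on the bind body below: the IPv4 address is loaded
with a raw u32 cast, which assumes ip_addr points at four network-order
bytes that are safe to access as a u32. An alignment-safe equivalent would
be:

    memcpy(&nd_dev->bind_pkt.ip_address.ipv4.sin_addr.s_addr,
           ip_addr, sizeof(u32));
]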
> - */ > - > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - hvnd_debug("bind packet type is %d ID:%d\n", > - pkt_type, NDV_PACKET_TYPE_ID(pkt_type)); > - nd_dev->bind_pkt.pkt_hdr.packet_type = pkt_type; > - > - nd_dev->bind_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_bind_1); > - hvnd_debug("bind packet size is %d\n", > - (int)sizeof(struct ndv_pkt_hdr_bind_1)); > - nd_dev->bind_pkt.pkt_hdr.data_sz = 0; > - nd_dev->bind_pkt.unbind = un_bind; > - nd_dev->bind_pkt.ip_address.address_family = AF_INET; > - nd_dev->bind_pkt.ip_address.ipv4.sin_family = AF_INET; > - nd_dev->bind_pkt.ip_address.ipv4.sin_port = 0; > - nd_dev->bind_pkt.ip_address.ipv4.sin_addr.s_addr = *(unsigned int *)ip_addr; > - > - nd_dev->bind_pkt.phys_addr.length = ETH_ALEN; > - memcpy(nd_dev->bind_pkt.phys_addr.addr, mac_addr, ETH_ALEN); > - > - /* > - * This is the adapter handle; needs to be unique for each > - * MAC, ip address tuple. > - */ > - nd_dev->bind_pkt.guest_id = (u64)nd_dev; > - > - ret = hvnd_send_packet(nd_dev, &nd_dev->bind_pkt, > - sizeof(struct ndv_pkt_hdr_bind_1), > - (u64)NULL, > - true); > - return ret; > -} > - > -int hvnd_init_resources(struct hvnd_dev *nd_dev) > -{ > - unsigned long mmio_sz; > - struct resource *resrc; > - int ret = -ENOMEM; > - > - resrc = &iomem_resource; > - > - mmio_sz = nd_dev->hvdev->channel->offermsg.offer.mmio_megabytes * > - 1024 * 1024; > - nd_dev->mmio_sz = mmio_sz; > - nd_dev->mmio_resource.name = KBUILD_MODNAME; > - nd_dev->mmio_resource.flags = IORESOURCE_MEM | IORESOURCE_BUSY; > - > - ret = allocate_resource(resrc, &nd_dev->mmio_resource, > - mmio_sz, 0, -1, mmio_sz, NULL, NULL); > - > - if (ret) { > - hvnd_error("Unable to allocate mmio resources\n"); > - return ret; > - } > - hvnd_debug("MMIO start is %p\n", (void *)nd_dev->mmio_resource.start); > - > - /* > - * Send the mmio information over to the host. > - */ > - nd_dev->resources.pkt_hdr.packet_type = NDV_PKT_ID1_INIT_RESOURCES; > - nd_dev->resources.pkt_hdr.hdr_sz = sizeof(union ndv_packet_hdr); > - nd_dev->resources.pkt_hdr.data_sz = 0; > - > - nd_dev->resources.io_space_sz_mb = mmio_sz; > - nd_dev->resources.io_space_start = nd_dev->mmio_resource.start; > - > - ret = hvnd_send_packet(nd_dev, &nd_dev->resources, > - sizeof(struct ndv_pkt_hdr_init_resources_1), > - (u64)NULL, > - true); > - return ret; > -} > - > -int hvnd_query_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > -{ > - struct pkt_nd_query_adaptor *pkt; > - int ret; > - int pkt_type; > - struct ib_device_attr *props = &nd_dev->ibdev.attrs; > - struct adapter_info_v2 *adap_info = &nd_dev->query_pkt.ioctl.ad_info; > - > - hvnd_debug("Performing Adapter query nd_dev=%p\n", nd_dev); > - > - // check if there is a need to do query > - if (nd_dev->query_pkt_set) > - return 0; > - > - // need a lock, multiple process can call this at the same time > - > - down(&nd_dev->query_pkt_sem); > - if (nd_dev->query_pkt_set) { > - up(&nd_dev->query_pkt_sem); > - return 0; > - } > - > - /* > - * Now query the adaptor. 
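[The double check around query_pkt_sem above is the usual
check / lock / re-check idiom, so concurrent openers pay for at most one
host round trip:

    if (nd_dev->query_pkt_set)
            return 0;
    down(&nd_dev->query_pkt_sem);
    if (nd_dev->query_pkt_set) {
            up(&nd_dev->query_pkt_sem);   /* lost the race; already done */
            return 0;
    }
]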
> - */ > - > - pkt = &nd_dev->query_pkt; > - > - pkt_type = NDV_PKT_ID1_CONTROL; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - > - pkt->hdr.pkt_hdr.packet_type = pkt_type; > - pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > - pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_query_adaptor) - > - sizeof(struct ndv_packet_hdr_control_1); > - > - > - pkt->hdr.file_handle.local = uctx->file_handle.local; > - pkt->hdr.file_handle.remote = uctx->file_handle.remote; > - > - pkt->hdr.irp_handle.val64 = 0; > - > - pkt->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_QUERY; > - pkt->hdr.output_buf_sz = sizeof(struct nd_adap_query_ioctl); > - pkt->hdr.input_buf_sz = sizeof(struct nd_adap_query_ioctl); > - pkt->hdr.input_output_buf_offset = 0; > - memset(&pkt->ioctl.ad_q, 0, sizeof(struct nd_adap_query_ioctl)); > - > - pkt->ioctl.ad_q.version = ND_VERSION_1; > - pkt->ioctl.ad_q.info_version = ND_VERSION_2; > - pkt->ioctl.ad_q.adapter_handle = uctx->adaptor_hdl; > - > - ret = hvnd_send_packet(nd_dev, pkt, > - sizeof(struct pkt_nd_query_adaptor), > - (unsigned long)pkt, true); > - > - hvnd_debug("pkt->ioctl.ad_info.inline_request_threshold=%d\n", > - pkt->ioctl.ad_info.inline_request_threshold); > - nd_dev->query_pkt_set = true; > - > - // how about host returning PENDING > - up(&nd_dev->query_pkt_sem); > - > - if (ret) > - return ret; > - > - hvnd_debug("Query Adaptor Succeeded\n"); > - > - /* > - * Cache the relevant properties out. > - */ > - props->fw_ver = 0; > - props->device_cap_flags = 0; > - props->vendor_id = 0x15b3; > - props->vendor_part_id = adap_info->device_id; > - props->max_mr_size = ~0ull; > - props->page_size_cap = PAGE_SIZE; > - props->max_qp = 16384; > - props->max_qp_wr = min(adap_info->max_recv_q_depth, > - adap_info->max_initiator_q_depth); > - props->max_sge = min(adap_info->max_initiator_sge, > - adap_info->max_recv_sge); > - props->max_cq = 0x1FFFF; > - props->max_cqe = adap_info->max_completion_q_depth; > - props->max_mr = 16384; > - props->max_pd = 16384; > - props->max_qp_rd_atom = adap_info->max_inbound_read_limit; > - props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; > - props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; > - props->max_srq = 16384; > - props->max_srq_wr = adap_info->max_recv_q_depth; > - props->max_srq_sge = adap_info->max_recv_sge; > - return 0; > -} > - > - > -int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev, > - struct hvnd_ib_pd *hvnd_pd) > -{ > - struct pkt_nd_pd_create *pkt = &uctx->pd_cr_pkt; > - int ret; > - int pkt_type; > - > - hvnd_debug("Create Protection Domain\n"); > - > - pkt_type = NDV_PKT_ID1_CONTROL; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - > - pkt->hdr.pkt_hdr.packet_type = pkt_type; > - pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > - pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_pd_create) - > - sizeof(struct ndv_packet_hdr_control_1); > - > - hvnd_debug("pdcreate packet size: %d\n", > - (int)sizeof(struct pkt_nd_pd_create)); > - hvnd_debug("pdcreate hdr size: %d\n", > - (int)sizeof(struct ndv_packet_hdr_control_1)); > - hvnd_debug("pdcreate data size: %d\n", pkt->hdr.pkt_hdr.data_sz); > - > - pkt->hdr.file_handle.local = uctx->create_pkt.handle.local; > - pkt->hdr.file_handle.remote = uctx->create_pkt.handle.remote; > - > - hvnd_debug("create pd uctx is %p\n", uctx); > - hvnd_debug("create pd local file is %d\n", > - uctx->create_pkt.handle.local); > - hvnd_debug("create pd local file 
is %d\n", > - uctx->create_pkt.handle.remote); > - > - pkt->hdr.irp_handle.val64 = 0; > - pkt->hdr.io_cntrl_code = IOCTL_ND_PD_CREATE; > - > - pkt->hdr.output_buf_sz = sizeof(struct nd_create_pd_ioctl); > - pkt->hdr.input_buf_sz = sizeof(struct nd_create_pd_ioctl); > - pkt->hdr.input_output_buf_offset = 0; > - > - hvnd_debug("output/input buf size: %d\n", pkt->hdr.output_buf_sz); > - > - /* > - * Fill the ioctl section. > - */ > - pkt->ioctl.in.version = ND_VERSION_1; > - pkt->ioctl.in.reserved = 0; > - pkt->ioctl.in.handle = uctx->adaptor_hdl; > - > - > - ret = hvnd_send_packet(nd_dev, pkt, > - sizeof(struct pkt_nd_pd_create), > - (unsigned long)pkt, true); > - > - if (ret) > - return ret; > - > - if (pkt->hdr.pkt_hdr.status != 0) { > - hvnd_error("Create PD failed; status is %d\n", > - pkt->hdr.pkt_hdr.status); > - return -EINVAL; > - } > - if (pkt->hdr.io_status != 0) { > - hvnd_error("Create PD failed;io status is %d\n", > - pkt->hdr.io_status); > - return -EINVAL; > - } > - > - hvnd_debug("Create PD Succeeded\n"); > - > - hvnd_debug("pd_handle is %p\n", (void *)pkt->ioctl.resp.pd_handle); > - hvnd_debug("pdn is %d\n", (int)pkt->ioctl.resp.pdn); > - > - hvnd_pd->pdn = pkt->ioctl.resp.pdn; > - hvnd_pd->handle = pkt->ioctl.out_handle; > - > - return 0; > -} > - > -int hvnd_cancel_io(struct hvnd_ep_obj *ep_object) > -{ > - struct pkt_nd_cancel_io pkt; > - int ret; > - u32 ioctl; > - > - switch (ep_object->type) { > - case ND_LISTENER: > - hvnd_debug("LISTENER I/O Cancelled\n"); > - ioctl = IOCTL_ND_LISTENER_CANCEL_IO; > - break; > - case ND_CONNECTOR: > - hvnd_debug("CONNECTOR I/O Cancelled\n"); > - ioctl = IOCTL_ND_CONNECTOR_CANCEL_IO; > - break; > - case ND_MR: > - hvnd_debug("MR I/O Cancelled\n"); > - ioctl = IOCTL_ND_MR_CANCEL_IO; > - break; > - case ND_CQ: > - hvnd_debug("CQ I/O Cancelled\n"); > - ioctl = IOCTL_ND_CQ_CANCEL_IO; > - break; > - default: > - hvnd_error("UNKNOWN object type\n"); > - return -EINVAL; > - } > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - ep_object->uctx->create_pkt.handle.local, > - ep_object->uctx->create_pkt.handle.remote, > - ioctl, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = ep_object->ep_handle; > - hvnd_debug("cancel io handle is %p\n", (void *)ep_object->ep_handle); > - > - ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, &pkt.hdr, > - sizeof(pkt), > - (u64)&pkt); > - > - if (ret) > - goto err; > - > - /* > - * Now that we have cancelled all I/Os, > - */ > - > - return 0; > - > -err: > - hvnd_error("cancel I/O operation failed\n"); > - return ret; > -} > - > - > -int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle, u32 ioctl) > -{ > - struct pkt_nd_free_handle pkt; > - int ret; > - > - hvnd_debug("Freeing handle ioctl is %s; handle is %p\n", > - hvnd_get_op_name(ioctl), (void *)handle); > - > - hvnd_debug("uctx is %p\n", uctx); > - hvnd_debug("nd_dev is %p\n", nd_dev); > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - ioctl, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. 
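[All of the free and cancel paths funnel through the same
{ version, reserved, handle } input block, which is why a single
hvnd_free_handle() can retire adapters, PDs, QPs and MRs with only the ioctl
code varying, e.g.:

    ret = hvnd_free_handle(nd_dev, uctx, qp->qp_handle, IOCTL_ND_QP_FREE);
]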
> - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - hvnd_error("%s: ret=%d\n", __func__, ret); > - return ret; > -} > - > -int hvnd_negotiate_version(struct hvnd_dev *nd_dev) > -{ > - union ndv_packet_init *pkt = &nd_dev->init_pkt; > - int ret; > - > - nd_dev->negotiated_version = NDV_PROTOCOL_VAERSION_INVALID; > - > - pkt->packet_type = NDV_PACKET_TYPE_INIT; > - pkt->protocol_version = NDV_PROTOCOL_VERSION_CURRENT; > - pkt->flags = 0; // KYS are the flags 0? > - > - ret = hvnd_send_packet(nd_dev, pkt, > - sizeof(union ndv_packet_init), (u64)NULL, true); > - > - return ret; > -} > - > -void hvnd_callback(void *context) > -{ > - struct hv_device *dev = context; > - struct hvnd_dev *nd_dev = hv_get_drvdata(dev); > - int copy_sz = 0; > - struct ndv_packet_hdr_control_1 *ctrl_hdr; > - union ndv_packet_init *pkt_init; > - u32 recvlen; > - u32 local_irp; > - u64 requestid; > - u32 *pkt_type; > - u32 pkt_id; > - struct hvnd_ep_obj *ep_object; > - struct incoming_pkt *incoming_pkt; /* Used only for asynch calls */ > - char *incoming_pkt_start; > - struct vmpacket_descriptor *desc; > - int status; > - struct hvnd_cookie *hvnd_cookie; > - unsigned long flags; > - > - vmbus_recvpacket_raw(dev->channel, hvnd_recv_buffer, > - (PAGE_SIZE * 4), &recvlen, &requestid); > - > - if (recvlen <= 0) > - return; > - > - desc = (struct vmpacket_descriptor *)hvnd_recv_buffer; > - incoming_pkt_start = hvnd_recv_buffer + (desc->offset8 << 3); > - recvlen -= desc->offset8 << 3; > - > - pkt_type = (u32 *)incoming_pkt_start; > - pkt_id = *pkt_type; > - if (pkt_id != NDV_PACKET_TYPE_INIT) > - pkt_id = NDV_PACKET_TYPE_ID(pkt_id); > - > - switch (pkt_id) { > - case NDV_PACKET_TYPE_INIT: > - /* > - * Host is responding to our init packet. 
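[The dispatch above works because packet types keep their id in the low 24
bits and their options in the top byte; with the macros from vmbus_rdma.h:

    type = NDV_PACKET_TYPE(NDV_PKT_ID1_CONTROL,
                           NDV_PACKET_OPTIONS_REQUIRES_PASSIVE);
    /* NDV_PACKET_TYPE_ID(type)      -> NDV_PKT_ID1_CONTROL (low 24 bits) */
    /* NDV_PACKET_TYPE_OPTIONS(type) -> NDV_PACKET_OPTIONS_REQUIRES_PASSIVE
                                        (high 8 bits)                     */
]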
> - */ > - pkt_init = (union ndv_packet_init *)incoming_pkt_start; > - nd_dev->negotiated_version = pkt_init->protocol_version; > - copy_sz = 0; > - break; > - > - case NDV_PKT_ID1_INIT_RESOURCES: > - copy_sz = 0; > - break; > - > - case NDV_PKT_ID1_BIND: > - nd_dev->bind_pkt.pkt_hdr.status = > - ((union ndv_packet_hdr *) incoming_pkt_start)->status; > - copy_sz = 0; > - break; > - > - case NDV_PKT_ID1_COMPLETE: > - ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; > - status = ctrl_hdr->io_status; > - > - local_irp = ctrl_hdr->irp_handle.local; > - ep_object = (struct hvnd_ep_obj *)map_irp_to_ctx(nd_dev, > - local_irp); > - > - if (!ep_object) { > - hvnd_error("irp could not be mapped; irp is %d ioctl is %s", > - local_irp, > - hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); > - goto complete; > - } > - > - if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) > - hvnd_debug("completion packet; iostatus is %x, ioctl is %s", > - ctrl_hdr->io_status, > - hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); > - > - switch (ctrl_hdr->io_cntrl_code) { > - > - case IOCTL_ND_CQ_NOTIFY: > - hvnd_process_cq_event_complete(ep_object, status); > - > - ep_del_work_pending(ep_object); > - goto complete; > - > - case IOCTL_ND_CONNECTOR_ACCEPT: > - > - hvnd_process_connector_accept(ep_object, status); > - > - ep_del_work_pending(ep_object); > - goto complete; > - > - case IOCTL_ND_CONNECTOR_DISCONNECT: > - hvnd_debug("disconnected: ep opj is %p; status: %d\n", > - ep_object, status); > - hvnd_process_disconnect(ep_object, status); > - > - ep_del_work_pending(ep_object); > - goto complete; > - > - default: > - break; > - } > - > - /* > - * This is the completion notification; > - * the IRP cookie is the state through which > - * we will invoke the callback. > - */ > - incoming_pkt = kmalloc(recvlen + sizeof(struct incoming_pkt), > - GFP_ATOMIC); > - if (incoming_pkt == NULL) { > - hvnd_error("Could not alloc memory in callback\n"); > - ep_del_work_pending(ep_object); > - goto complete; > - } > - memcpy(incoming_pkt->pkt, incoming_pkt_start, recvlen); > - > - spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); > - list_add_tail(&incoming_pkt->list_entry, > - &ep_object->incoming_pkt_list); > - spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, > - flags); > - > - schedule_work(&ep_object->wrk.work); > - > - goto complete; > - > - case NDV_PKT_ID1_CREATE: > - copy_sz = sizeof(struct ndv_pkt_hdr_create_1); > - break; > - > - case NDV_PKT_ID1_CLEANUP: > - copy_sz = sizeof(struct ndv_pkt_hdr_cleanup_1); > - break; > - > - case NDV_PKT_ID1_CONTROL: > - ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; > - status = ctrl_hdr->io_status; > - > - if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) > - hvnd_debug("packet; iostatus is %x ioctl is %s", > - ctrl_hdr->io_status, hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); > - > - switch (ctrl_hdr->io_cntrl_code) { > - > - case IOCTL_ND_PROVIDER_INIT: > - copy_sz = sizeof(struct pkt_nd_provider_ioctl); > - break; > - > - case IOCTL_ND_PROVIDER_BIND_FILE: > - copy_sz = sizeof(struct pkt_nd_provider_ioctl); > - break; > - > - case IOCTL_ND_ADAPTER_OPEN: > - copy_sz = sizeof(struct pkt_nd_open_adapter); > - break; > - > - case IOCTL_ND_ADAPTER_CLOSE: > - copy_sz = sizeof(struct pkt_nd_free_handle); > - break; > - > - case IOCTL_ND_ADAPTER_QUERY: > - copy_sz = sizeof(struct pkt_nd_query_adaptor); > - break; > - > - case IOCTL_ND_PD_CREATE: > - copy_sz = sizeof(struct pkt_nd_pd_create); > - break; > - > - case IOCTL_ND_PD_FREE: > - copy_sz = 
sizeof(struct pkt_nd_free_handle); > - break; > - > - case IOCTL_ND_CQ_CREATE: > - copy_sz = sizeof(struct pkt_nd_create_cq); > - break; > - > - case IOCTL_ND_CQ_FREE: > - copy_sz = sizeof(struct pkt_nd_free_cq); > - break; > - > - case IOCTL_ND_CQ_NOTIFY: //FIXME check ep stop state > - local_irp = ctrl_hdr->irp_handle.local; > - ep_object = (struct hvnd_ep_obj *) > - map_irp_to_ctx(nd_dev, local_irp); > - if (!ep_object) { > - hvnd_error("irp could not be mapped\n"); > - goto complete; > - return; > - } > - copy_sz = sizeof(struct pkt_nd_notify_cq); > - hvnd_process_cq_event_pending(ep_object, status); > - goto complete; > - return; > - > - case IOCTL_ND_LISTENER_CREATE: > - copy_sz = sizeof(struct pkt_nd_cr_listener); > - break; > - > - case IOCTL_ND_LISTENER_FREE: > - copy_sz = sizeof(struct pkt_nd_free_listener); > - break; > - > - case IOCTL_ND_QP_FREE: > - copy_sz = sizeof(struct pkt_nd_free_handle); > - break; > - > - case IOCTL_ND_CONNECTOR_CANCEL_IO: > - case IOCTL_ND_MR_CANCEL_IO: > - case IOCTL_ND_CQ_CANCEL_IO: > - case IOCTL_ND_LISTENER_CANCEL_IO: > - copy_sz = sizeof(struct pkt_nd_cancel_io); > - break; > - > - case IOCTL_ND_LISTENER_BIND: > - copy_sz = sizeof(struct pkt_nd_bind_listener); > - break; > - > - case IOCTL_ND_LISTENER_LISTEN: > - copy_sz = sizeof(struct pkt_nd_listen_listener); > - break; > - > - case IOCTL_ND_LISTENER_GET_ADDRESS: > - copy_sz = sizeof(struct pkt_nd_get_addr_listener); > - break; > - > - case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: > - copy_sz = sizeof(struct pkt_nd_get_connection_listener); > - goto complete; // non-block > - > - case IOCTL_ND_CONNECTOR_CREATE: > - copy_sz = sizeof(struct pkt_nd_cr_connector); > - break; > - > - case IOCTL_ND_CONNECTOR_FREE: > - copy_sz = sizeof(struct pkt_nd_free_connector); > - break; > - > - case IOCTL_ND_CONNECTOR_BIND: > - copy_sz = sizeof(struct pkt_nd_free_connector); > - break; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_CONNECT: > - copy_sz = sizeof(struct pkt_nd_connector_connect); > - goto complete; //non-block > - > - case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: > - copy_sz = sizeof(struct pkt_nd_connector_connect_complete); > - break; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_ACCEPT: > - copy_sz = sizeof(struct pkt_nd_connector_accept); > - goto complete; //non-block > - > - case IOCTL_ND_CONNECTOR_REJECT: > - copy_sz = sizeof(struct pkt_nd_connector_reject); > - break; > - > - case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: > - copy_sz = sizeof(struct pkt_nd_connector_get_rd_limits); > - break; > - > - case IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: > - copy_sz = sizeof(struct pkt_nd_connector_get_priv_data); > - break; > - > - case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: > - copy_sz = sizeof(struct pkt_nd_connector_get_peer_addr); > - break; > - > - case IOCTL_ND_CONNECTOR_GET_ADDRESS: > - copy_sz = sizeof(struct pkt_nd_connector_get_addr); > - break; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: > - copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); > - goto complete; //non-block > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_DISCONNECT: > - hvnd_debug("IOCTL_ND_CONNECTOR_DISCONNECT\n"); > - copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); > - goto complete; // non-block > - > - case IOCTL_ND_QP_CREATE: > - copy_sz = sizeof(struct pkt_nd_create_qp); > - break; > - > - case IOCTL_ND_MR_CREATE: > - copy_sz = sizeof(struct pkt_nd_create_mr); > - break; > - > - case IOCTL_ND_MR_FREE: > 
- copy_sz = sizeof(struct pkt_nd_free_handle); > - break; > - > - case IOCTL_ND_MR_REGISTER: > - copy_sz = sizeof(struct pkt_nd_register_mr); > - break; > - > - case IOCTL_ND_MR_DEREGISTER: > - copy_sz = sizeof(struct pkt_nd_deregister_mr); > - break; > - > - case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: > - copy_sz = sizeof(struct pkt_query_addr_list); > - break; > - > - case IOCTL_ND_QP_FLUSH: > - copy_sz = sizeof(struct pkt_nd_flush_qp); > - break; > - > - default: > - hvnd_warn("Got unknown ioctl: %d\n", > - ctrl_hdr->io_cntrl_code); > - copy_sz = 0; > - break; > - } > - > - break; > - default: > - hvnd_warn("Got an unknown packet type %d\n", *pkt_type); > - break; > - } > - > - hvnd_cookie = (struct hvnd_cookie *)requestid; > - memcpy(hvnd_cookie->pkt, incoming_pkt_start, copy_sz); > - complete(&hvnd_cookie->host_event); > - > -complete: > - /* send out ioctl completion patcket */ > - if (desc->flags & VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED) { > - int retry = 5; > - > - while (true) { > - int ret; > - > - ret = vmbus_sendpacket(dev->channel, NULL, 0, > - requestid, VM_PKT_COMP, 0); > - if (ret == 0) { > - break; > - } else if (ret == -EAGAIN) { > - if (--retry == 0) { > - hvnd_error("give up retrying send completion packet\n"); > - break; > - } > - hvnd_warn("retrying send completion packet\n"); > - udelay(100); > - } else { > - hvnd_error("unable to send completion packet ret=%d\n", > - ret); > - break; > - } > - } > - } > - > -} > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.h > deleted file mode 100644 > index 78e534fa8748..000000000000 > --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.h > +++ /dev/null > @@ -1,2202 +0,0 @@ > -/* > - * Copyright (c) 2014, Microsoft Corporation. > - * > - * Author: > - * K. Y. Srinivasan <kys@microsoft.com> > - * > - * This program is free software; you can redistribute it and/or modify it > - * under the terms of the GNU General Public License version 2 as published > - * by the Free Software Foundation. > - * > - * This program is distributed in the hope that it will be useful, but > - * WITHOUT ANY WARRANTY; without even the implied warranty of > - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or > - * NON INFRINGEMENT. See the GNU General Public License for more > - * details. > - * > - * Bug fixes/enhancements: Long Li <longli@microsoft.com> > - */ > - > - > -#ifndef _VMBUS_RDMA_H > -#define _VMBUS_RDMA_H > - > - > -#include <linux/in.h> > -#include <linux/in6.h> > -#include <rdma/ib_verbs.h> > -#include <linux/idr.h> > -#include <linux/if_ether.h> > - > -/* NetworkDirect version Numbers. 
> - */ > -#define ND_VERSION_1 0x1 > -#define ND_VERSION_2 0x20000 > - > -#ifndef NDVER > -#define NDVER ND_VERSION_2 > -#endif > - > -#define ND_ADAPTER_FLAG_IN_ORDER_DMA_SUPPORTED 0x00000001 > -#define ND_ADAPTER_FLAG_CQ_INTERRUPT_MODERATION_SUPPORTED 0x00000004 > -#define ND_ADAPTER_FLAG_MULTI_ENGINE_SUPPORTED 0x00000008 > -#define ND_ADAPTER_FLAG_CQ_RESIZE_SUPPORTED 0x00000100 > -#define ND_ADAPTER_FLAG_LOOPBACK_CONNECTIONS_SUPPORTED 0x00010000 > - > -#define ND_CQ_NOTIFY_ERRORS 0 > -#define ND_CQ_NOTIFY_ANY 1 > -#define ND_CQ_NOTIFY_SOLICITED 2 > - > -#define ND_MR_FLAG_ALLOW_LOCAL_WRITE 0x00000001 > -#define ND_MR_FLAG_ALLOW_REMOTE_READ 0x00000002 > -#define ND_MR_FLAG_ALLOW_REMOTE_WRITE 0x00000005 > -#define ND_MR_FLAG_RDMA_READ_SINK 0x00000008 > -#define ND_MR_FLAG_DO_NOT_SECURE_VM 0x80000000 > - > -#define ND_OP_FLAG_SILENT_SUCCESS 0x00000001 > -#define ND_OP_FLAG_READ_FENCE 0x00000002 > -#define ND_OP_FLAG_SEND_AND_SOLICIT_EVENT 0x00000004 > -#define ND_OP_FLAG_ALLOW_READ 0x00000008 > -#define ND_OP_FLAG_ALLOW_WRITE 0x00000010 > - > -#if NDVER >= ND_VERSION_2 > -#define ND_OP_FLAG_INLINE 0x00000020 > -#endif > - > -#define ND_AF_INET6 23 > -#define IF_MAX_ADDR_LENGTH 32 > - > -struct group_affinity { > - u64 mask; //KYS: usually 0 > - u16 group; // KYS usually -1 > - u16 reserved[3]; > -}; > - > -struct if_physical_addr { > - u16 length; > - u8 addr[IF_MAX_ADDR_LENGTH]; > -}; > - > -struct adapter_info_v2 { > - u32 info_version; > - u16 vendor_id; > - u16 device_id; > - u64 adapter_id; > - size_t max_registration_size; > - size_t max_window_size; > - u32 max_initiator_sge; > - u32 max_recv_sge; > - u32 max_read_sge; > - u32 max_transfer_length; > - u32 max_inline_data_size; > - u32 max_inbound_read_limit; > - u32 max_outbound_read_limit; > - u32 max_recv_q_depth; > - u32 max_initiator_q_depth; > - u32 max_shared_recv_q_depth; > - u32 max_completion_q_depth; > - u32 inline_request_threshold; > - u32 large_request_threshold; > - u32 max_caller_data; > - u32 max_callee_data; > - u32 adapter_flags; > -} __packed; > - > -struct nd2_adapter_info_32 { //KYS: Check what this is > - u32 info_version; > - u16 vendor_id; > - u16 devic_id; > - u64 adapter_id; > - u32 max_registration_size; > - u32 max_window_size; > - u32 max_initiator_sge; > - u32 max_recv_sge; > - u32 max_read_sge; > - u32 max_transfer_length; > - u32 max_inline_data_size; > - u32 max_inbound_read_limit; > - u32 max_outbound_read_limit; > - u32 max_recv_q_depth; > - u32 max_initiator_q_depth; > - u32 max_shared_recv_q_depth; > - u32 max_completion_q_depth; > - u32 inline_request_threshold; > - u32 large_request_threshold; > - u32 max_caller_data; > - u32 max_callee_data; > - u32 adapter_flags; > -} __packed; > - > -enum nd2_request_type { > - ND2_RT_RECEIVE, > - ND2_RT_SEND, > - ND2_RT_BIND, > - ND2_RT_INVALIDATE, > - ND2_RT_READ, > - ND2_RT_WRITE > -}; > - > -struct nd2_result { > - u32 status; > - u32 bytes_transferred; > - void *qp_ctx; > - void *request_ctx; > - enum nd2_request_type request_type; > -} __packed; > - > -struct nd2_sge { > - void *buffer; > - u32 buffer_length; > - u32 mr_token; > -} __packed; > - > -/* > - * The communication with the host via ioctls using VMBUS > - * as the transport. 
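[The ioctl codes below follow the Windows CTL_CODE layout (device type FDN
0x12, METHOD_BUFFERED, access FAA 0), with resource and operation packed
into the 12-bit function field. A worked example:

    /* IOCTL_ND(ND_LISTENER, 3), i.e. IOCTL_ND_LISTENER_BIND:        */
    /* function = (ND_LISTENER << 6) | 3 = (8 << 6) | 3   = 0x203    */
    /* code     = (0x12 << 16) | (0x203 << 2)             = 0x12080c */
]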
> - */ > - > -#define ND_IOCTL_VERSION 1 > - > -enum nd_mapping_type { > - ND_MAP_IOSPACE, > - ND_MAP_MEMORY, > - ND_MAP_MEMORY_COALLESCE, > - ND_MAP_PAGES, > - ND_MAP_PAGES_COALLESCE, > - ND_UNMAP_IOSPACE, > - ND_UNMAP_MEMORY, > - ND_MAX_MAP_TYPE > -}; > - > -enum nd_caching_type { > - ND_NON_CACHED = 0, > - ND_CACHED, > - ND_WRITE_COMBINED, > - ND_MAX_CACHE_TYPE > -}; > - > -enum nd_aceess_type { > - ND_READ_ACCESS = 0, > - ND_WRITE_ACCESS, > - ND_MODIFY_ACCESS > -}; > - > -struct nd_map_io_space { > - enum nd_mapping_type map_type; > - enum nd_caching_type cache_type; > - u32 cb_length; > -}; > - > -struct nd_map_memory { > - enum nd_mapping_type map_type; > - enum nd_aceess_type access_type; > - u64 address; > - u32 cb_length; > -}; > - > -struct nd_mapping_id { > - enum nd_mapping_type map_type; > - u64 id; > -}; > - > -struct ndk_map_pages { > - struct nd_map_memory header; > - u32 page_offset; > -}; > - > -union nd_mapping { > - enum nd_mapping_type map_type; > - struct nd_map_io_space map_io_space; > - struct nd_map_memory map_memory; > - struct nd_mapping_id mapping_id; > - struct ndk_map_pages map_pages; > -}; > - > -struct nd_mapping_result { > - u64 id; > - u64 info; > -}; > - > -struct nd_resource_descriptor { > - u64 handle; > - u32 ce_mapping_results; > - u32 cb_mapping_results_offset; > -}; > - > -struct nd_handle { > - u32 version; > - u32 reserved; > - u64 handle; > -}; > - > -union nd_sockaddr_inet { > - struct sockaddr_in ipv4; > - struct sockaddr_in6 ipv6; > - u16 address_family; //KYS how is this supposed to work? > -}; > - > -struct nd_address_element { > - union nd_sockaddr_inet addr; > - char mac_addr[ETH_ALEN]; > -}; > - > -struct nd_resolve_address { > - u32 version; > - u32 reserved; > - union nd_sockaddr_inet address; > -}; > - > -struct nd_open_adapter { > - u32 version; > - u32 reserved; > - u32 ce_mapping_cnt; > - u32 cb_mapping_offset; > - u64 adapter_id; > -}; > - > -struct nd_adapter_query { > - u32 version; > - u32 info_version; > - u64 adapter_handle; > -}; > - > -struct nd_create_cq { > - u32 version; > - u32 queue_depth; > - u32 ce_mapping_cnt; > - u32 cb_mapping_offset; > - u64 adapter_handle; > - struct group_affinity affinity; > -}; > - > -struct nd_create_srq { > - u32 version; > - u32 queue_depth; > - u32 ce_mapping_cnt; > - u32 cb_mapping_offset; > - u32 max_request_sge; > - u32 notify_threshold; > - u64 pd_handle; > - struct group_affinity affinity; > -}; > - > -struct nd_create_qp_hdr { > - u32 version; > - u32 cb_max_inline_data; > - u32 ce_mapping_cnt; > - u32 cb_mapping_offset; //KYS: what is this prefix - ce/cb > - u32 initiator_queue_depth; > - u32 max_initiator_request_sge; > - u64 receive_cq_handle; > - u64 initiator_cq_handle; > - u64 pd_handle; > -}; > - > -struct nd_create_qp { > - struct nd_create_qp_hdr hdr; > - u32 receive_queue_depth; > - u32 max_receive_request_sge; > -}; > - > -struct nd_create_qp_with_srq { > - struct nd_create_qp_hdr header; > - u64 srq_handle; > -}; > - > -struct nd_srq_modify { > - u32 version; > - u32 queue_depth; > - u32 ce_mapping_cnt; > - u32 cb_mapping_offset; > - u32 notify_threshold; > - u32 reserved; > - u64 srq_handle; > -}; > - > -struct nd_cq_modify { > - u32 version; > - u32 queue_depth; > - u32 ce_mapping_count; > - u32 cb_mappings_offset; > - u64 cq_handle; > -}; > - > -struct nd_cq_notify { > - u32 version; > - u32 type; > - u64 cq_handle; > -}; > - > -struct nd_mr_register_hdr { > - u32 version; > - u32 flags; > - u64 cb_length; > - u64 target_addr; > - u64 mr_handle; > -}; > - > -struct 
nd_mr_register { > - struct nd_mr_register_hdr header; > - u64 address; > -}; > - > -struct nd_bind { > - u32 version; > - u32 reserved; > - u64 handle; > - union nd_sockaddr_inet address; > -}; > - > -struct nd_read_limits { > - u32 inbound; > - u32 outbound; > -}; > - > -struct nd_connect { > - u32 version; > - u32 reserved; > - struct nd_read_limits read_limits; > - u32 cb_private_data_length; > - u32 cb_private_data_offset; > - u64 connector_handle; > - u64 qp_handle; > - union nd_sockaddr_inet destination_address; > - struct if_physical_addr phys_addr; > -}; > - > -struct nd_accept { > - u32 version; > - u32 reserved; > - struct nd_read_limits read_limits; > - u32 cb_private_data_length; > - u32 cb_private_data_offset; > - u64 connector_handle; > - u64 qp_handle; > -}; > - > -struct nd_reject { > - u32 version; > - u32 reserved; > - u32 cb_private_data_length; > - u32 cb_private_data_offset; > - u64 connector_handle; > -}; > - > -struct nd_listen { > - u32 version; > - u32 back_log; > - u64 listener_handle; > -}; > - > -struct nd_get_connection_request { > - u32 version; > - u32 reserved; > - u64 listener_handle; > - u64 connector_handle; > -}; > - > -enum ndv_mmio_type { > - ND_PARTITION_KERNEL_VIRTUAL, > - ND_PARTITION_SYSTEM_PHYSICAL, > - ND_PARTITION_GUEST_PHYSICAL, > - ND_MAXIMUM_MMIO_TYPE > -}; > - > -struct ndv_resolve_adapter_id { > - u32 version; > - struct if_physical_addr phys_addr; > -}; > - > -struct ndv_partition_create { > - u32 version; > - enum ndv_mmio_type mmio_type; > - u64 adapter_id; > - u64 xmit_cap; > -}; > - > -struct ndv_partition_bind_luid { > - u32 version; > - u32 reserved; > - u64 partition_handle; > - struct if_physical_addr phys_addr; > - //IF_LUID luid; //KYS? > -}; > - > -struct ndv_partition_bind_address { > - u32 version; > - u32 reserved; > - u64 partition_handle; > - union nd_sockaddr_inet address; > - struct if_physical_addr guest_phys_addr; > - struct if_physical_addr phys_addr; > -}; > - > -struct ndk_mr_register { > - struct nd_mr_register_hdr hdr; > - u32 cb_logical_page_addresses_offset; > -}; > - > -struct ndk_bind { > - struct nd_bind hdr; > - u64 authentication_id; > - bool is_admin; > -}; > - > -#define FDN 0x12 > -#define METHOD_BUFFERED 0x0 > -#define FAA 0x0 > - > -#define CTL_CODE( DeviceType, Function, Method, Access ) ( \ > - ((DeviceType) << 16) | ((Access) << 14) | ((Function) << 2) | (Method) \ > -) > - > -#define ND_FUNCTION(r_, i_) ((r_) << 6 | (i_)) > -#define IOCTL_ND(r_, i_) \ > - CTL_CODE( FDN, ND_FUNCTION((r_), (i_)), METHOD_BUFFERED, FAA ) > - > -#define ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) ((ctrlCode_ >> 2) & 0xFFF) > -#define ND_RESOURCE_FROM_CTL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) >> 6) > -#define ND_OPERATION_FROM_CTRL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) & 0x3F) > - > -#define ND_DOS_DEVICE_NAME L"\\DosDevices\\Global\\NetworkDirect" > -#define ND_WIN32_DEVICE_NAME L"\\\\.\\NetworkDirect" > - > -enum nd_resource_type { > - ND_PROVIDER = 0, > - ND_ADAPTER, > - ND_PD, > - ND_CQ, > - ND_MR, > - ND_MW, > - ND_SRQ, > - ND_CONNECTOR, > - ND_LISTENER, > - ND_QP, > - ND_VIRTUAL_PARTITION, > - ND_RESOURCE_TYPE_COUNT > -}; > - > -#define ND_OPERATION_COUNT 14 > - > -#define IOCTL_ND_PROVIDER(i_) IOCTL_ND(ND_PROVIDER, i_) > -#define IOCTL_ND_ADAPTER(i_) IOCTL_ND(ND_ADAPTER, i_) > -#define IOCTL_ND_PD(i_) IOCTL_ND(ND_PD, i_) > -#define IOCTL_ND_CQ(i_) IOCTL_ND(ND_CQ, i_) > -#define IOCTL_ND_MR(i_) IOCTL_ND(ND_MR, i_) > -#define IOCTL_ND_MW(i_) IOCTL_ND(ND_MW, i_) > -#define IOCTL_ND_SRQ(i_) 
IOCTL_ND(ND_SRQ, i_) > -#define IOCTL_ND_CONNECTOR(i_) IOCTL_ND(ND_CONNECTOR, i_) > -#define IOCTL_ND_LISTENER(i_) IOCTL_ND(ND_LISTENER, i_) > -#define IOCTL_ND_QP(i_) IOCTL_ND(ND_QP, i_) > -#define IOCTL_ND_VIRTUAL_PARTITION(i_) IOCTL_ND(ND_VIRTUAL_PARTITION, i_) > - > -/* Provider IOCTLs */ > -#define IOCTL_ND_PROVIDER_INIT IOCTL_ND_PROVIDER( 0 ) > -#define IOCTL_ND_PROVIDER_BIND_FILE IOCTL_ND_PROVIDER( 1 ) > -#define IOCTL_ND_PROVIDER_QUERY_ADDRESS_LIST IOCTL_ND_PROVIDER( 2 ) > -#define IOCTL_ND_PROVIDER_RESOLVE_ADDRESS IOCTL_ND_PROVIDER( 3 ) > -#define IOCTL_ND_PROVIDER_MAX_OPERATION 4 > - > -/* Adapter IOCTLs */ > -#define IOCTL_ND_ADAPTER_OPEN IOCTL_ND_ADAPTER( 0 ) > -#define IOCTL_ND_ADAPTER_CLOSE IOCTL_ND_ADAPTER( 1 ) > -#define IOCTL_ND_ADAPTER_QUERY IOCTL_ND_ADAPTER( 2 ) > -#define IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST IOCTL_ND_ADAPTER( 3 ) > -#define IOCTL_ND_ADAPTER_MAX_OPERATION 4 > - > -/* Protection Domain IOCTLs */ > -#define IOCTL_ND_PD_CREATE IOCTL_ND_PD( 0 ) > -#define IOCTL_ND_PD_FREE IOCTL_ND_PD( 1 ) > -#define IOCTL_ND_PD_MAX_OPERATION 2 > - > -/* Completion Queue IOCTLs */ > -#define IOCTL_ND_CQ_CREATE IOCTL_ND_CQ( 0 ) > -#define IOCTL_ND_CQ_FREE IOCTL_ND_CQ( 1 ) > -#define IOCTL_ND_CQ_CANCEL_IO IOCTL_ND_CQ( 2 ) > -#define IOCTL_ND_CQ_GET_AFFINITY IOCTL_ND_CQ( 3 ) > -#define IOCTL_ND_CQ_MODIFY IOCTL_ND_CQ( 4 ) > -#define IOCTL_ND_CQ_NOTIFY IOCTL_ND_CQ( 5 ) > -#define IOCTL_ND_CQ_MAX_OPERATION 6 > - > -/* Memory Region IOCTLs */ > -#define IOCTL_ND_MR_CREATE IOCTL_ND_MR( 0 ) > -#define IOCTL_ND_MR_FREE IOCTL_ND_MR( 1 ) > -#define IOCTL_ND_MR_CANCEL_IO IOCTL_ND_MR( 2 ) > -#define IOCTL_ND_MR_REGISTER IOCTL_ND_MR( 3 ) > -#define IOCTL_ND_MR_DEREGISTER IOCTL_ND_MR( 4 ) > -#define IOCTL_NDK_MR_REGISTER IOCTL_ND_MR( 5 ) > -#define IOCTL_ND_MR_MAX_OPERATION 6 > - > -/* Memory Window IOCTLs */ > -#define IOCTL_ND_MW_CREATE IOCTL_ND_MW( 0 ) > -#define IOCTL_ND_MW_FREE IOCTL_ND_MW( 1 ) > -#define IOCTL_ND_MW_MAX_OPERATION 2 > - > -/* Shared Receive Queue IOCTLs */ > -#define IOCTL_ND_SRQ_CREATE IOCTL_ND_SRQ( 0 ) > -#define IOCTL_ND_SRQ_FREE IOCTL_ND_SRQ( 1 ) > -#define IOCTL_ND_SRQ_CANCEL_IO IOCTL_ND_SRQ( 2 ) > -#define IOCTL_ND_SRQ_GET_AFFINITY IOCTL_ND_SRQ( 3 ) > -#define IOCTL_ND_SRQ_MODIFY IOCTL_ND_SRQ( 4 ) > -#define IOCTL_ND_SRQ_NOTIFY IOCTL_ND_SRQ( 5 ) > -#define IOCTL_ND_SRQ_MAX_OPERATION 6 > - > -/* Connector IOCTLs */ > -#define IOCTL_ND_CONNECTOR_CREATE IOCTL_ND_CONNECTOR( 0 ) > -#define IOCTL_ND_CONNECTOR_FREE IOCTL_ND_CONNECTOR( 1 ) > -#define IOCTL_ND_CONNECTOR_CANCEL_IO IOCTL_ND_CONNECTOR( 2 ) > -#define IOCTL_ND_CONNECTOR_BIND IOCTL_ND_CONNECTOR( 3 ) > -#define IOCTL_ND_CONNECTOR_CONNECT IOCTL_ND_CONNECTOR( 4 ) > -#define IOCTL_ND_CONNECTOR_COMPLETE_CONNECT IOCTL_ND_CONNECTOR( 5 ) > -#define IOCTL_ND_CONNECTOR_ACCEPT IOCTL_ND_CONNECTOR( 6 ) > -#define IOCTL_ND_CONNECTOR_REJECT IOCTL_ND_CONNECTOR( 7 ) > -#define IOCTL_ND_CONNECTOR_GET_READ_LIMITS IOCTL_ND_CONNECTOR( 8 ) > -#define IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA IOCTL_ND_CONNECTOR( 9 ) > -#define IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS IOCTL_ND_CONNECTOR( 10 ) > -#define IOCTL_ND_CONNECTOR_GET_ADDRESS IOCTL_ND_CONNECTOR( 11 ) > -#define IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT IOCTL_ND_CONNECTOR( 12 ) > -#define IOCTL_ND_CONNECTOR_DISCONNECT IOCTL_ND_CONNECTOR( 13 ) > -#define IOCTL_ND_CONNECTOR_MAX_OPERATION 14 > - > -/* Listener IOCTLs */ > -#define IOCTL_ND_LISTENER_CREATE IOCTL_ND_LISTENER( 0 ) > -#define IOCTL_ND_LISTENER_FREE IOCTL_ND_LISTENER( 1 ) > -#define IOCTL_ND_LISTENER_CANCEL_IO IOCTL_ND_LISTENER( 2 ) > 
-#define IOCTL_ND_LISTENER_BIND IOCTL_ND_LISTENER( 3 ) > -#define IOCTL_ND_LISTENER_LISTEN IOCTL_ND_LISTENER( 4 ) > -#define IOCTL_ND_LISTENER_GET_ADDRESS IOCTL_ND_LISTENER( 5 ) > -#define IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST IOCTL_ND_LISTENER( 6 ) > -#define IOCTL_ND_LISTENER_MAX_OPERATION 7 > - > -/* Queue Pair IOCTLs */ > -#define IOCTL_ND_QP_CREATE IOCTL_ND_QP( 0 ) > -#define IOCTL_ND_QP_CREATE_WITH_SRQ IOCTL_ND_QP( 1 ) > -#define IOCTL_ND_QP_FREE IOCTL_ND_QP( 2 ) > -#define IOCTL_ND_QP_FLUSH IOCTL_ND_QP( 3 ) > -#define IOCTL_ND_QP_MAX_OPERATION 4 > - > -/* Kernel-mode only IOCTLs (IRP_MJ_INTERNAL_DEVICE_CONTROL) */ > -#define IOCTL_NDV_PARTITION_RESOLVE_ADAPTER_ID IOCTL_ND_VIRTUAL_PARTITION( 0 ) > -#define IOCTL_NDV_PARTITION_CREATE IOCTL_ND_VIRTUAL_PARTITION( 1 ) > -#define IOCTL_NDV_PARTITION_FREE IOCTL_ND_VIRTUAL_PARTITION( 2 ) > -#define IOCTL_NDV_PARTITION_BIND IOCTL_ND_VIRTUAL_PARTITION( 3 ) > -#define IOCTL_NDV_PARTITION_UNBIND IOCTL_ND_VIRTUAL_PARTITION( 4 ) > -#define IOCTL_NDV_PARTITION_BIND_LUID IOCTL_ND_VIRTUAL_PARTITION( 5 ) > -#define IOCTL_NDV_PARTITION_MAX_OPERATION 6 > - > - > -#define MB_SHIFT 20 > - > - > -/* Ringbuffer size for the channel */ > -#define NDV_NUM_PAGES_IN_RING_BUFFER 64 > - > -#define NDV_MAX_PACKETS_PER_RECEIVE 8 > - > -#define NDV_MAX_PACKET_COUNT 16304 > - > -#define NDV_MAX_NUM_OUTSTANDING_RECEIVED_PACKETS (16304) > -#define NDV_MAX_HANDLE_TABLE_SIZE (16304) > -#define NDV_HOST_MAX_HANDLE_TABLE_SIZE (NDV_MAX_HANDLE_TABLE_SIZE * 16) > - > - > -#define NDV_MAX_MAPPINGS 4 > - > -#define NDV_STATE_NONE 0x00000000 > -#define NDV_STATE_CREATED 0x00000001 > -#define NDV_STATE_CONNECTING 0x00000002 > -#define NDV_STATE_INITIALIZING 0x00000003 > -#define NDV_STATE_OPERATIONAL 0xEFFFFFFF > -#define NDV_STATE_FAILED 0xFFFFFFFF > - > - > -#define NDV_MAX_PRIVATE_DATA_SIZE 64 > -#define NDV_MAX_IOCTL_SIZE 256 > - > -/* max size of buffer for vector of ND_MAPPING */ > -#define NDV_MAX_MAPPING_BUFFER_SIZE \ > - (NDV_MAX_MAPPINGS * sizeof(union nd_mapping)) > - > -/* max expected ioctl buffer size from users */ > -#define NDV_MAX_IOCTL_BUFFER_SIZE \ > - (NDV_MAX_IOCTL_SIZE + \ > - NDV_MAX_MAPPING_BUFFER_SIZE + \ > - NDV_MAX_PRIVATE_DATA_SIZE) > - > -/* max PFN array for inline buffers */ > -#define NDV_MAX_INLINE_PFN_ARRAY_LENGTH 32 > - > -/* Field header size for inline buffer */ > -#define NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE \ > - (NDV_MAX_MAPPINGS * sizeof(NDV_PACKET_FIELD)) > - > -/* Max for a single field */ > - > -#define NDV_MAX_SINGLE_MAPPING_FIELD ( sizeof(GPA_RANGE) + \ > - (sizeof(PFN_NUMBER) * NDV_MAX_INLINE_PFN_ARRAY_LENGTH)) > - > -/* Max for all inline data */ > - > -#define NDV_MAX_MAPPING_DATA_SIZE (NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE + \ > - (NDV_MAX_MAPPINGS * NDV_MAX_SINGLE_MAPPING_FIELD)) > - > - > -#define NDV_MAX_PACKET_HEADER_SIZE 256 > - > -#define NDV_MAX_PACKET_SIZE (NDV_MAX_PACKET_HEADER_SIZE + \ > - NDV_MAX_IOCTL_BUFFER_SIZE + \ > - NDV_MAX_MAPPING_DATA_SIZE) > - > -/* Well-known message type INIT is defined for the channel, > - * not for the protocol. > - */ > - > -#define NDV_PACKET_TYPE_INIT 0xFFFFFFFF > - > -/* Invalid protocol version to identify uninitialized channels */ > - > -#define NDV_PROTOCOL_VERSION_INVALID 0xFFFFFFFF > - > -/* Flags that control the behavior of packet handling */ > - > -enum ndv_packet_options { > - NDV_PACKET_OPTION_NONE = 0x00, > - > - /* Indicates that the ExternalDataMdl parameter is expected to be > - * passed and must be handled in the receiver.
This call must be > - * handled specially to ensure that the MDL can be created correctly. > - */ > - NDV_PACKET_OPTION_EXTERNAL_DATA = 0x01, > - > - /* Indicates that the receiver must execute the handler at passive. */ > - NDV_PACKET_OPTIONS_REQUIRES_PASSIVE = 0x02, > - > - /* Indicates that the sender does not expect and is not waiting for a > - * response packet. > - */ > - NDV_PACKET_OPTIONS_POST = 0x04, > -}; > - > -#define NDV_PACKET_TYPE(id_, opt_) \ > - (((opt_)<<24) | (id_)) > - > -#define NDV_PACKET_TYPE_OPTIONS(type_) \ > - (((type_) >> 24) & 0xFF) > - > -#define NDV_PACKET_TYPE_ID(type_) \ > - ((type_) & 0xFFFFFF) \ > - > -#define NDV_ADD_PACKET_OPTION(type_, opt_) \ > - (type_) |= (opt_<<24) > - > -/* The header value sent on all packets */ > -union ndv_packet_hdr { > - > - struct { > - /* The type of packet. > - * This value should be created with the NDV_PACKET_TYPE macro > - * to include all packet options within the packet type. > - */ > - u32 packet_type; > - /* The size of the entire fixed message structure that exists > - * before the data. This must be >= sizeof(NDV_PACKET_HEADER) > - */ > - u32 hdr_sz; > - /* The size of the data that follows the message; > - * data_sz + hdr_sz gives the total size of > - * the buffer that is used. > - */ > - u32 data_sz; > - /* The status code used to indicate success or failure. > - * It is only used in completions and during responses. > - */ > - u32 status; //KYS: NTSTATUS? > - }; > - > - u64 padding[2]; //KYS: why? > -}; > - > - > -/* The core INIT packet. This message is defined in the channel, > - * not in the protocol. This message should never change size > - * or behavior, as it could impact compatibility in the future. > - * This packet is used to negotiate the protocol version, so changing > - * this size could break backward compat. > - */ > - > -union ndv_packet_init { > - struct { > - u32 packet_type; > - u32 protocol_version; > - u32 flags; > - }; > - u64 padding[2]; > -} __packed; > - > -#define NDV_PACKET_INIT_SIZE 16 > - > -/* Data packing flags used for accessing the dynamic fields inside a packet */ > -#define NDV_DATA_PACKING_2 0x1 > -#define NDV_DATA_PACKING_4 0x3 > -#define NDV_DATA_PACKING_8 0x7 > - > - > -#define NDV_PROTOCOL_VERSION_1 0x0100 > -#define NDV_PROTOCOL_VERSION_CURRENT NDV_PROTOCOL_VERSION_1 > -#define NDV_PROTOCOL_VERSION_COUNT 1 > - > -struct ndv_pkt_field { > - u32 size; > - u32 offset; > -}; > - > -enum ndv_pkt_id { > - NDV_PKT_UNKNOWN = 0, > - /* Version 1 Message IDs */ > - NDV_PKT_ID1_BIND, > - NDV_PKT_ID1_CREATE, > - NDV_PKT_ID1_CLEANUP, > - NDV_PKT_ID1_CANCEL, > - NDV_PKT_ID1_CONTROL, > - NDV_PKT_ID1_COMPLETE, > - NDV_PKT_ID1_INIT_RESOURCES, > -}; > - > -/* The guest will send this as the first message just after init. > - * The resources are reserved per channel. > - */ > - > -struct ndv_pkt_hdr_init_resources_1 { > - > - union ndv_packet_hdr pkt_hdr; > - u16 io_space_sz_mb; > - u64 io_space_start; > - > -}; > - > - > - > -/* The guest will send this packet to the host after channel init > - * to query support for the adapters that are registered. > - */
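(A note for anyone reviewing the wire format, not part of the patch itself: the NDV_PACKET_TYPE() macros above pack the ndv_packet_options flags into the top byte of the 32-bit packet type and keep the message ID in the low 24 bits. A minimal sketch of how they compose:

	u32 type, id, opts;

	/* Build: option flags land in bits 31..24, the ID in bits 23..0. */
	type = NDV_PACKET_TYPE(NDV_PKT_ID1_CONTROL,
			       NDV_PACKET_OPTIONS_REQUIRES_PASSIVE);

	/* Decompose on the receive path. */
	id   = NDV_PACKET_TYPE_ID(type);	/* == NDV_PKT_ID1_CONTROL */
	opts = NDV_PACKET_TYPE_OPTIONS(type);	/* == 0x02 */

	/* NDV_ADD_PACKET_OPTION() ORs into the high byte, so options
	 * accumulate rather than replace one another.
	 */
	NDV_ADD_PACKET_OPTION(type, NDV_PACKET_OPTIONS_POST);

NDV_PACKET_TYPE_INIT deliberately sits outside this scheme, since its option byte would decode as 0xFF.)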
> - > -struct ndv_pkt_hdr_bind_1 { > - union ndv_packet_hdr pkt_hdr; > - bool unbind; > - union nd_sockaddr_inet ip_address; > - struct if_physical_addr phys_addr; > - u64 guest_id; > -}; > - > -union ndv_context_handle { > - u64 val64; > - struct { > - u32 local; > - u32 remote; > - }; > -}; > - > -struct ndv_pkt_hdr_create_1 { > - union ndv_packet_hdr pkt_hdr; > - > - /* Identifies the object used to track this file handle on both > - * the guest and the host. When sent from the guest, it will contain > - * the guest handle. On success, the host will populate and return > - * its handle value as well. > - */ > - > - union ndv_context_handle handle; > - > - /* The parameters sent to the CreateFile call */ > - u32 access_mask; > - u32 open_options; > - > - u16 file_attributes; //KYS: This field must be 64 bit aligned > - > - u16 share_access; //KYS > - > - u32 kys_padding; //KYS > - > - u16 ea_length; //KYS; needs to be 64 bit aligned; what is ea length - unused > -}; > - > - > -struct ndv_pkt_hdr_cleanup_1 { > - union ndv_packet_hdr pkt_hdr; > - > - /* Identifies the object used to track this file handle on both > - * the guest and the host. When sent from the guest, it will contain > - * both the guest and host handle values. The host will use this > - * value to clean up its resources, then update its portion of the handle > - * to NDV_HANDLE_NULL before returning the data back to the guest. > - */ > - union ndv_context_handle handle; > -}; > - > -struct ndv_pkt_hdr_cancel_1 { > - union ndv_packet_hdr pkt_hdr; > - union ndv_context_handle file_handle; > - union ndv_context_handle irp_handle; > -}; > - > -struct ndv_bind_port_info { > - //LUID authentication_id; //KYS: LUID? > - bool is_admin; > -}; > - > -struct ndv_extended_data_flds { > - union { > - u32 field_count; > - u64 padding; > - }; > - > - //struct ndv_pkt_field fields[ANYSIZE_ARRAY]; //KYS? > -}; > - > - > -struct ndv_packet_hdr_control_1 { > - union ndv_packet_hdr pkt_hdr; > - /* Identifies the object used to track this file handle on both > - * the guest and the host. This should always have both guest > - * and host handle values inside it. > - */ > - > - union ndv_context_handle file_handle; > - > - /* The handle information for the allocated irp context object. > - * This information is used when the host/guest starts the cancellation > - */ > - union ndv_context_handle irp_handle; > - > - /* The input data describing the IO control parameters */ > - > - u32 io_cntrl_code; > - u32 output_buf_sz; > - u32 input_buf_sz; > - u32 input_output_buf_offset; > - > - /* These are used in the return message to indicate the status of the IO > - * operation and the amount of data written to the output buffer. > - */ > - u32 io_status; //KYS: NTSTATUS? > - u32 bytes_returned; > - > - /* This contains the field information for additional data that is sent > - * with the packet that is IOCTL specific. > - */ > - > - struct ndv_pkt_field extended_data; > -}; > - > -/* > - * Include MLX specific defines. > - */ > - > -#include "mx_abi.h" > - > -/* Driver specific state. > - */ > - > -/* > - * We need to have the host open a file; some > - * Windows constants for open.
> - */ > -#define STANDARD_RIGHTS_ALL (0x001F0000L) > -#define FILE_ATTRIBUTE_NORMAL (0x80) > -#define FILE_SHARE_READ (0x00000001) > -#define FILE_SHARE_WRITE (0x00000002) > -#define FILE_SHARE_DELETE (0x00000004) > -#define FILE_FLAG_OVERLAPPED (0x40000000) > -#define FILE_SHARE_ALL (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE) > -#define CREATE_ALWAYS (2) > -#define OPEN_EXISTING (3) > - > -#define RTL_NUMBER_OF(_x) \ > - sizeof(_x)/sizeof(_x[0]) > -/* > - * The context structure tracks the open state. > - */ > - > -/* > - * Packet layout for open adaptor. > - */ > - > -/* > - * Packet for querying the address list. > - */ > - > -union query_addr_list_ioctl { > - struct nd_handle in; > - union nd_sockaddr_inet out[16]; //KYS a max of 16 addresses > -}; > - > -struct pkt_query_addr_list { > - struct ndv_packet_hdr_control_1 hdr; > - union query_addr_list_ioctl ioctl; > - unsigned long activity_id; > -}; > - > - > -struct pkt_fld { > - u32 size; > - u32 offset; > -}; > - > -struct fld_data { > - union { > - u64 padding; > - }; > -}; > - > -struct extended_data_oad { > - union { > - u32 cnt; > - u64 padding; > - }; > - /* offsets are from start of extended data struct > - * and should start on 8 byte boundary > - */ > - struct pkt_fld fields[IBV_GET_CONTEXT_MAPPING_MAX]; > -}; > - > -union oad_ioctl { > - struct nd_open_adapter input; > - struct nd_resource_descriptor resrc_desc; > -}; > - > -union oad_mappings { > - struct ibv_get_context_req ctx_input; > - struct ibv_get_context_resp ctx_output; > -}; > - > -struct pkt_nd_open_adapter { > - struct ndv_packet_hdr_control_1 hdr; > - > - union oad_ioctl ioctl; > - union oad_mappings mappings; > - > - /* > - * Extended data. > - */ > - struct extended_data_oad ext_data; > -}; > - > -/* > - * Create CQ IOCTL. > - */ > - > -struct cq_db_gpa { > - u32 byte_count; > - u32 byte_offset; > - u64 pfn_array[2]; > -}; > - > -struct cq_sn_gpa { > - u32 byte_count; > - u32 byte_offset; > - u64 pfn_array[2]; > -}; > - > -struct create_cq_ext_data { > - union { > - u32 cnt; > - u64 padding; > - }; > - /* offsets are from start of extended data struct > - * and should start on 8 byte boundary > - */ > - struct pkt_fld fields[MLX4_IB_CREATE_CQ_MAPPING_MAX]; > - struct cq_db_gpa db_gpa; > - struct cq_sn_gpa sn_gpa; > - struct gpa_range cqbuf_gpa; > -}; > - > -union create_cq_ioctl { > - struct nd_create_cq input; > - struct nd_resource_descriptor resrc_desc; > -}; > - > -union create_cq_mappings { > - struct ibv_create_cq cq_in; > - struct ibv_create_cq_resp cq_resp; > -}; > - > -struct pkt_nd_create_cq { > - struct ndv_packet_hdr_control_1 hdr; > - > - union create_cq_ioctl ioctl; > - union create_cq_mappings mappings; > - > - /* > - * Extended data. > - */ > - struct create_cq_ext_data ext_data; > -}; > - > -/* > - * IOCTL to free CQ. 
> - */ > -struct free_cq_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_free_cq { > - struct ndv_packet_hdr_control_1 hdr; > - > - struct free_cq_ioctl ioctl; > -}; > - > - > -/* > - * IOCTL to QUERY CQ - CQ NOTIFY > - */ > - > -struct notify_cq_ioctl { > - struct nd_cq_notify in; > -}; > - > -struct pkt_nd_notify_cq { > - struct ndv_packet_hdr_control_1 hdr; > - struct notify_cq_ioctl ioctl; > -}; > - > -/* > - * IOCTL to Create a listener > - */ > - > -struct nd_ep_create { > - struct nd_handle hdr; > - bool to_semantics; > - unsigned long activity_id; > -}; > - > -union listener_cr_ioctl { > - struct nd_ep_create in; > - u64 out; > -}; > - > -struct pkt_nd_cr_listener { > - struct ndv_packet_hdr_control_1 hdr; > - union listener_cr_ioctl ioctl; > -}; > - > -/* > - * IOCTL to free listener. > - */ > - > -struct listener_free_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_free_listener { > - struct ndv_packet_hdr_control_1 hdr; > - struct listener_free_ioctl ioctl; > -}; > - > -/* > - * IOCTL for listener cancel IO. > - */ > -struct listener_cancelio_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_cancelio_listener { > - struct ndv_packet_hdr_control_1 hdr; > - struct listener_cancelio_ioctl ioctl; > -}; > - > -/* > - * IOCTL for LISTENER BIND > - */ > - > -union listener_bind_ioctl { > - struct ndk_bind in; > -}; > - > -struct pkt_nd_bind_listener { > - struct ndv_packet_hdr_control_1 hdr; > - union listener_bind_ioctl ioctl; > -}; > - > -/* > - * After the listener is bound, enable > - * listening. > - */ > - > -union listener_listen_ioctl { > - struct nd_listen in; > -}; > - > -struct pkt_nd_listen_listener { > - struct ndv_packet_hdr_control_1 hdr; > - union listener_listen_ioctl ioctl; > -}; > - > -/* > - * IOCTL for getting the address from listener. > - * > - */ > - > -union listener_get_addr_ioctl { > - struct nd_handle in; > - union nd_sockaddr_inet out; > -}; > - > -struct pkt_nd_get_addr_listener { > - struct ndv_packet_hdr_control_1 hdr; > - union listener_get_addr_ioctl ioctl; > -}; > - > -/* > - * IOCTL to get a connection from a listener. > - */ > - > -union listener_get_connection_ioctl { > - struct nd_get_connection_request in; > - union nd_sockaddr_inet out; > -}; > - > -struct pkt_nd_get_connection_listener { > - struct ndv_packet_hdr_control_1 hdr; > - union listener_get_connection_ioctl ioctl; > -}; > - > - > -/* > - * Connector IOCTLs > - */ > - > -/* > - * IOCTL to create connector. > - */ > - > -union connector_cr_ioctl { //KYS should this be a union or struct? > - struct nd_ep_create in; > - u64 out; > -}; > - > -struct pkt_nd_cr_connector { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_cr_ioctl ioctl; //KYS: union or struct > -}; > - > -/* > - * IOCTL to free connector. > - */ > - > -struct connector_free_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_free_connector { > - struct ndv_packet_hdr_control_1 hdr; > - struct connector_free_ioctl ioctl; > -}; > - > -/* > - * IOCTL to cancel I/O on a connector. > - */ > - > -struct connector_cancelio_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_cancelio_connector { > - struct ndv_packet_hdr_control_1 hdr; > - struct connector_cancelio_ioctl ioctl; > -}; > - > -/* > - * IOCTL to Bind an address to the connector. > - */
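(Reviewer aside, not part of the patch: every pkt_nd_* wrapper in this header follows the same request/response convention — a ndv_packet_hdr_control_1 followed by an in/out ioctl union, sent synchronously to the host. A sketch of the calling pattern, modeled on hvnd_get_outgoing_rdma_addr() later in this patch; listener_handle, uctx and nd_dev are assumed to be in scope:

	struct pkt_nd_free_listener pkt;
	int ret;

	memset(&pkt, 0, sizeof(pkt));
	hvnd_init_hdr(&pkt.hdr,
		      sizeof(pkt) - sizeof(struct ndv_packet_hdr_control_1),
		      uctx->create_pkt.handle.local,
		      uctx->create_pkt.handle.remote,
		      IOCTL_ND_LISTENER_FREE, 0, 0, 0);

	pkt.ioctl.in.version = ND_VERSION_1;
	pkt.ioctl.in.handle = listener_handle;

	/* Blocks until the host completes the ioctl. */
	ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt);
)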
> - > -union connector_bind_ioctl { > - struct ndk_bind in; > -}; > - > -struct pkt_nd_bind_connector { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_bind_ioctl ioctl; > -}; > - > -/* > - * IOCTL to connect a connector. > - */ > - > -struct connector_connect_in { > - struct nd_connect hdr; > - u8 retry_cnt; > - u8 rnr_retry_cnt; > - u8 priv_data[56]; > - unsigned long activity_id; > -}; > - > -union connector_connect_ioctl { > - struct connector_connect_in in; > -}; > - > -struct pkt_nd_connector_connect { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_connect_ioctl ioctl; > -}; > - > -/* > - * IOCTL for connector complete connect > - */ > - > -struct complete_connect_in { > - struct nd_handle hdr; > - u8 rnr_nak_to; > - unsigned long activity_id; > -}; > - > -struct complete_connect_out { > - enum ibv_qp_state state; > -}; > - > -union connector_complete_connect_ioctl { > - struct complete_connect_in in; > - struct complete_connect_out out; > -}; > - > -struct pkt_nd_connector_connect_complete { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_complete_connect_ioctl ioctl; > -}; > - > - > -#define MAX_PRIVATE_DATA_LEN 148 > - > -/* > - * IOCTL for connector accept. > - */ > - > -struct connector_accept_in { > - struct nd_accept hdr; > - u8 rnr_retry_cnt; > - u8 rnr_nak_to; > - u8 private_data[MAX_PRIVATE_DATA_LEN]; > - unsigned long activity_id; > -}; > - > -struct connector_accept_out { > - enum ibv_qp_state state; > -}; > - > -union connector_accept_ioctl { > - struct connector_accept_in in; > - struct connector_accept_out out; > -}; > - > -struct pkt_nd_connector_accept { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_accept_ioctl ioctl; > -}; > - > -/* > - * IOCTL for connector to reject a connection. > - */ > - > -struct connector_reject_in { > - struct nd_reject hdr; > - u8 private_data[MAX_PRIVATE_DATA_LEN]; > -}; > - > -struct connector_reject_out { > - enum ibv_qp_state state; > -}; > - > -union connector_reject_ioctl { > - struct connector_reject_in in; > - struct connector_reject_out out; > -}; > - > -struct pkt_nd_connector_reject { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_reject_ioctl ioctl; > -}; > - > -/* > - * IOCTL to get connector read limits. > - */ > - > -struct connector_get_rd_limits_in { > - struct nd_handle in; > -}; > - > -struct connector_get_rd_limits_out { > - struct nd_read_limits out; > -}; > - > -union connector_get_rd_limits_ioctl { > - struct connector_get_rd_limits_in in; > - struct connector_get_rd_limits_out out; > -}; > - > -struct pkt_nd_connector_get_rd_limits { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_get_rd_limits_ioctl ioctl; > -}; > - > -/* > - * IOCTL to get connector private data. > - */ > -union connector_get_priv_data_ioctl { > - struct nd_handle in; > - u8 out[MAX_PRIVATE_DATA_LEN]; > -}; > - > -struct pkt_nd_connector_get_priv_data { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_get_priv_data_ioctl ioctl; > -}; > - > - > -/* > - * IOCTL to get peer address. > - */ > - > -union connector_get_peer_addr_ioctl { > - struct nd_handle in; > - union nd_sockaddr_inet out; > -}; > - > -struct pkt_nd_connector_get_peer_addr { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_get_peer_addr_ioctl ioctl; > -}; > - > -/* > - * IOCTL to get connector address.
> - */ > - > -union connector_get_addr_ioctl { > - struct nd_handle in; > - union nd_sockaddr_inet out; > -}; > - > -struct pkt_nd_connector_get_addr { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_get_addr_ioctl ioctl; > -}; > - > -/* > - * IOCTL for disconnect notification. > - */ > - > -union connector_notify_disconnect_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_connector_notify_disconnect { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_notify_disconnect_ioctl ioctl; > -}; > - > -union connector_disconnect_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_connector_disconnect { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_notify_disconnect_ioctl ioctl; > -}; > - > -/* > - * IOCTLs for QP operations. > - */ > - > -/* > - * Create qp IOCTL. > - */ > - > -struct qp_db_gpa { > - u32 byte_count; > - u32 byte_offset; > - u64 pfn_array[1]; > -}; > - > -struct create_qp_ext_data { > - union { > - u32 cnt; > - u64 padding; > - }; > - /* offsets are from start of extended data struct > - * and should start on 8 byte boundary > - */ > - struct pkt_fld fields[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; > - struct qp_db_gpa db_gpa; > - struct gpa_range qpbuf_gpa; > -}; > - > -union create_qp_ioctl { > - struct nd_create_qp input; > - struct nd_resource_descriptor resrc_desc; > -}; > - > -union create_qp_mappings { > - struct ibv_create_qp qp_in; > - struct ibv_create_qp_resp qp_resp; > -}; > - > -struct pkt_nd_create_qp { > - struct ndv_packet_hdr_control_1 hdr; > - > - union create_qp_ioctl ioctl; > - union create_qp_mappings mappings; > - > - /* > - * Extended data. > - */ > - struct create_qp_ext_data ext_data; > -}; > - > -/* > - * IOCTL to flush a QP. > - */ > -struct flush_qp_ioctl { > - struct nd_handle in; > - enum ibv_qp_state out; > -}; > - > -struct pkt_nd_flush_qp { > - struct ndv_packet_hdr_control_1 hdr; > - struct flush_qp_ioctl ioctl; > -}; > - > -/* > - * Memory Region IOCTLS > - */ > -union create_mr_ioctl { > - struct nd_handle in; > - u64 out; > -}; > - > -struct pkt_nd_create_mr { > - struct ndv_packet_hdr_control_1 hdr; > - union create_mr_ioctl ioctl; > -}; > - > -struct mr_out { > - u32 lkey; > - u32 rkey; > - unsigned long activity_id; > -}; > - > - > -union register_mr_ioctl { > - struct nd_mr_register in; > - struct mr_out out; > -}; > - > -struct pkt_nd_register_mr { > - struct ndv_packet_hdr_control_1 hdr; > - union register_mr_ioctl ioctl; > -}; > - > -struct deregister_mr_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_deregister_mr { > - struct ndv_packet_hdr_control_1 hdr; > - struct deregister_mr_ioctl ioctl; > -}; > - > -/* > - * IOCTL to disconnect connector > - */ > - > -/* > - * Create PD IOCTL. > - */ > -struct nd_create_pd_ioctl { > - union { > - struct nd_handle in; > - u64 out_handle; > - }; > - struct ibv_alloc_pd_resp resp; > -}; > - > -struct pkt_nd_pd_create { > - struct ndv_packet_hdr_control_1 hdr; > - struct nd_create_pd_ioctl ioctl; > -}; > - > -/* > - * Free Handle. Check the layout with Luke. > - * > - */ > -struct free_handle_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_free_handle { > - struct ndv_packet_hdr_control_1 hdr; > - struct free_handle_ioctl ioctl; > -}; > - > -/* > - * Cancel I/O. 
> - */ > - > -struct cancel_io_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_cancel_io { > - struct ndv_packet_hdr_control_1 hdr; > - struct cancel_io_ioctl ioctl; > -}; > - > -/* > - * Connector states: > - */ > - > -enum connector_state { > - HVND_CON_INCOMING, > - HVND_CON_INCOMING_ESTABLISHED, > - HVND_CON_INCOMING_REJECTED, > - HVND_CON_OUTGOING_REQUEST > -}; > - > - > -/* > - * Adaptor query IOCTL. > - */ > -struct nd_adap_query_ioctl { > - union { > - struct nd_adapter_query ad_q; > - struct adapter_info_v2 ad_info; > - }; > -}; > - > -struct pkt_nd_query_adaptor { > - struct ndv_packet_hdr_control_1 hdr; > - struct nd_adap_query_ioctl ioctl; > -}; > - > -struct nd_ioctl { > - union { > - struct nd_handle handle; > - u8 raw_buffer[NDV_MAX_IOCTL_BUFFER_SIZE]; > - }; > -}; > - > -struct pkt_nd_provider_ioctl { > - struct ndv_packet_hdr_control_1 hdr; > - struct nd_ioctl ioctl; > -}; > - > -struct hvnd_ib_pd { > - struct ib_pd ibpd; > - u32 pdn; > - u64 handle; > -}; > - > -struct hvnd_work { > - struct work_struct work; > - void *callback_arg; > -}; > - > -struct hvnd_disconnect_work { > - struct work_struct work; > - int status; > - void *callback_arg; > -}; > - > -/* > -struct hvnd_delayed_work { > - struct delayed_work work; > - void *callback_arg; > -}; > -*/ > - > -enum hvnd_cm_state { > - hvnd_cm_idle = 0, > - hvnd_cm_connect_reply_sent, //active > - hvnd_cm_connect_reply_refused, > - hvnd_cm_connect_received, //active > - hvnd_cm_connect_request_sent, //passive > - hvnd_cm_accept_sent, > - hvnd_cm_close_sent, > - hvnd_cm_established_sent, > -}; > - > -struct incoming_pkt { > - struct list_head list_entry; > - char pkt[0]; > -}; > - > -struct hvnd_ep_obj { > -/* > - spinlock_t ep_lk; > - bool to_be_destroyed; > - bool io_outstanding; > - > - wait_queue_head_t wait; > - bool stopped; > - atomic_t process_refcnt; // how many NDV_PKT_ID1_COMPLETE packets we are currently processing > -*/ > - bool stopping; > - wait_queue_head_t wait_pending; > - atomic_t nr_requests_pending; > - > - enum nd_resource_type type; > - enum connector_state state; //KYS need to look at locking > - struct iw_cm_id *cm_id; > - enum hvnd_cm_state cm_state; > - struct completion block_event; > - struct completion disconnect_event; > - struct completion connector_accept_event; > - int connector_accept_status; > - u64 ep_handle; > - spinlock_t incoming_pkt_list_lock; > - struct list_head incoming_pkt_list; > - struct hvnd_ep_obj *parent; > - struct hvnd_dev *nd_dev; > - struct hvnd_ucontext *uctx; > - struct hvnd_work wrk; > - struct hvnd_cq *cq; > - u8 ord; > - u8 ird; > - char priv_data[MAX_PRIVATE_DATA_LEN]; > - bool incoming; > - atomic_t disconnect_notified; > - u64 outstanding_handle; > - u32 local_irp; > - struct hvnd_ep_obj *outstanding_ep; > - struct pkt_nd_connector_connect connector_connect_pkt; > - int connector_connect_retry; > -}; > - > -struct hvnd_ucontext { > - struct ib_ucontext ibucontext; > - struct list_head listentry; > - struct ndv_pkt_hdr_create_1 create_pkt; > - struct ndv_pkt_hdr_create_1 create_pkt_ovl; /* Overlap handle */ > - struct pkt_nd_provider_ioctl pr_init_pkt; > - union ndv_context_handle file_handle; > - union ndv_context_handle file_handle_ovl; > - > - struct pkt_nd_open_adapter o_adap_pkt; > - > - u64 adaptor_hdl; > - > - /* > - * Protection domain state. 
> - */ > - struct pkt_nd_pd_create pd_cr_pkt; > - > - u64 uar_base; > - u64 bf_base; > - u32 bf_buf_size; > - u32 bf_offset; > - u32 cqe_size; > - u32 max_qp_wr; > - u32 max_sge; > - u32 max_cqe; > - u32 num_qps; > - > - /* > - * State to manage doorbell pages: > - */ > - struct list_head db_page_list; > - struct mutex db_page_mutex; > - > - atomic_t refcnt; > - > -}; > - > -struct hvnd_dev { > - struct ib_device ibdev; > - struct hv_device *hvdev; > - u32 device_cap_flags; > - unsigned char nports; > - bool ib_active; > - > - /* State to manage interaction with the host. > - */ > - > - spinlock_t uctxt_lk; > - struct list_head listentry; > - > - unsigned long mmio_sz; > - unsigned long mmio_start_addr; > - struct resource mmio_resource; > - void *mmio_virt; > - > - unsigned long negotiated_version; > - union ndv_packet_init init_pkt; > - struct ndv_pkt_hdr_init_resources_1 resources; > - struct ndv_pkt_hdr_bind_1 bind_pkt; > - > - struct ndv_pkt_hdr_create_1 global_create_pkt; > - union ndv_context_handle global_file_handle; > - > - struct semaphore query_pkt_sem; > - bool query_pkt_set; > - struct pkt_nd_query_adaptor query_pkt; > - > - /* > - * ID tables. > - */ > - spinlock_t id_lock; > - > - struct idr cqidr; > - struct idr qpidr; > - struct idr mmidr; > - struct idr irpidr; > - struct idr uctxidr; > - atomic_t open_cnt; > - > - char ip_addr[4]; > - char mac_addr[6]; > - struct completion addr_set; > - int bind_complete; > - struct mutex bind_mutex; > -}; > - > -struct hvnd_cq { > - struct ib_cq ibcq; > - void *cq_buf; > - void *db_addr; > - u32 arm_sn; > - u32 entries; > - > - u32 cqn; > - u32 cqe; > - u64 cq_handle; > - > - struct ib_umem *umem; > - struct ib_umem *db_umem; > - struct mlx4_ib_user_db_page user_db_page; > - struct hvnd_ucontext *uctx; > - struct hvnd_ep_obj ep_object; //KYS need to clean this up; have a cq irp state > - bool monitor; > - bool upcall_pending; > -}; > - > -struct hvnd_qp { > - struct ib_qp ibqp; > - void *qp_buf; > - void *db_addr; > - u32 buf_size; > - u8 port; > - struct hvnd_dev *nd_dev; > - > - __u8 log_sq_bb_count; > - __u8 log_sq_stride; > - __u8 sq_no_prefetch; > - > - int rq_wqe_cnt; > - int rq_wqe_shift; > - int rq_max_gs; > - > - int sq_wqe_cnt; > - int sq_wqe_shift; > - int sq_max_gs; > - > - u32 max_inline_data; > - > - u32 initiator_q_depth; > - u32 initiator_request_sge; > - > - u32 receive_q_depth; > - u32 receive_request_sge; > - > - struct hvnd_cq *recv_cq; > - struct hvnd_cq *send_cq; > - > - u64 receive_cq_handle; > - u64 initiator_cq_handle; > - u64 pd_handle; > - > - u64 qp_handle; > - u32 qpn; > - u32 max_send_wr; > - u32 max_recv_wr; > - u32 max_send_sge; > - u32 max_recv_sge; > - > - struct ib_umem *umem; > - struct ib_umem *db_umem; > - struct mlx4_ib_user_db_page user_db_page; > - struct hvnd_ucontext *uctx; > - struct iw_cm_id *cm_id; > - > - /* > - * Current QP state; need to look at locking. > - * XXXKYS > - */ > - enum ib_qp_state qp_state; > - bool cq_notify; > - wait_queue_head_t wait; > - atomic_t refcnt; > - struct hvnd_ep_obj *connector; > -}; > - > -struct hvnd_mr { > - struct ib_mr ibmr; > - struct hvnd_ib_pd *pd; > - struct ib_umem *umem; > - u64 start; > - u64 length; > - u64 virt; > - int acc; > - u64 mr_handle; > - u32 mr_lkey; > - u32 mr_rkey; > -}; > - > -struct hvnd_cookie { > - struct completion host_event; > - void *pkt; > -}; > - > -/* > - * Definitions to retrieve the IP address.
> - */ > - > -#define HVND_CURRENT_VERSION 0 > - > -struct hvnd_ipaddr_tuple { > - char mac_address[ETH_ALEN]; > - struct sockaddr addr; > -}; > - > -struct hvnd_msg { > - int status; > - struct hvnd_ipaddr_tuple ip_tuple; > -}; > - > -static inline struct hvnd_ib_pd *to_nd_pd(struct ib_pd *pd) > -{ > - return container_of(pd, struct hvnd_ib_pd, ibpd); > -} > - > -static inline struct hvnd_dev *to_nd_dev(struct ib_device *ibdev) > -{ > - return container_of(ibdev, struct hvnd_dev, ibdev); > -} > - > -static inline struct hvnd_cq *to_nd_cq(struct ib_cq *ibcq) > -{ > - return container_of(ibcq, struct hvnd_cq, ibcq); > -} > - > -static inline struct hvnd_qp *to_nd_qp(struct ib_qp *ibqp) > -{ > - return container_of(ibqp, struct hvnd_qp, ibqp); > -} > - > -static inline struct hvnd_ucontext *to_nd_context(struct ib_ucontext *ibucontext) > -{ > - return container_of(ibucontext, struct hvnd_ucontext, ibucontext); > -} > - > -static inline struct hvnd_ucontext *get_uctx_from_pd(struct ib_pd *pd) > -{ > - return to_nd_context(pd->uobject->context); > -} > - > -static inline struct hvnd_mr *to_nd_mr(struct ib_mr *ibmr) > -{ > - return container_of(ibmr, struct hvnd_mr, ibmr); > -} > -/* > - * ID management. > - */ > - > -static inline int insert_handle(struct hvnd_dev *dev, struct idr *idr, > - void *handle, u32 id) > -{ > - int ret; > - unsigned long flags; > - > - idr_preload(GFP_KERNEL); > - spin_lock_irqsave(&dev->id_lock, flags); > - > - ret = idr_alloc(idr, handle, id, id + 1, GFP_ATOMIC); > - > - spin_unlock_irqrestore(&dev->id_lock, flags); > - idr_preload_end(); > - > - BUG_ON(ret == -ENOSPC); > - return ret < 0 ? ret : 0; > -} > - > -static inline void remove_handle(struct hvnd_dev *dev, struct idr *idr, u32 id) > -{ > - unsigned long flags; > - > - spin_lock_irqsave(&dev->id_lock, flags); > - idr_remove(idr, id); > - spin_unlock_irqrestore(&dev->id_lock, flags); > -} > - > -static inline struct hvnd_cq *get_cqp(struct hvnd_dev *dev, u32 cqid) > -{ > - struct hvnd_cq *cqp; > - unsigned long flags; > - > - spin_lock_irqsave(&dev->id_lock, flags); > - cqp = idr_find(&dev->cqidr, cqid); > - spin_unlock_irqrestore(&dev->id_lock, flags); > - > - return cqp; > -} > - > -static inline struct hvnd_qp *get_qpp(struct hvnd_dev *dev, u32 qpid) > -{ > - struct hvnd_qp *qpp; > - unsigned long flags; > - > - spin_lock_irqsave(&dev->id_lock, flags); > - qpp = idr_find(&dev->qpidr, qpid); > - spin_unlock_irqrestore(&dev->id_lock, flags); > - > - return qpp; > -} > - > -static inline struct hvnd_ucontext *get_uctx(struct hvnd_dev *dev, u32 pid) > -{ > - struct hvnd_ucontext *uctx; > - unsigned long flags; > - > - spin_lock_irqsave(&dev->id_lock, flags); > - uctx = idr_find(&dev->uctxidr, pid); > - spin_unlock_irqrestore(&dev->id_lock, flags); > - > - return uctx; > -} > - > - > -static inline void *map_irp_to_ctx(struct hvnd_dev *nd_dev, u32 irp) > -{ > - void *ctx; > - unsigned long flags; > - > - spin_lock_irqsave(&nd_dev->id_lock, flags); > - ctx = idr_find(&nd_dev->irpidr, irp); > - spin_unlock_irqrestore(&nd_dev->id_lock, flags); > - > - return ctx; > -} > - > - > - > -void hvnd_callback(void *context); > -int hvnd_negotiate_version(struct hvnd_dev *nd_dev); > -int hvnd_init_resources(struct hvnd_dev *nd_dev); > -int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, char *mac_addr); > -int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx); > -int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx); > -int hvnd_query_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx);
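(One more reviewer aside: the idr helpers above are the per-device handle tables, all serialized by id_lock. The intended pairing over an object's lifetime looks roughly like this — a sketch only, with cq, cqid, nd_dev and ret assumed in scope:

	/* Publish the CQ under its number at create time... */
	ret = insert_handle(nd_dev, &nd_dev->cqidr, cq, cq->cqn);
	if (ret)
		return ret;

	/* ...look it up on the completion path... */
	cq = get_cqp(nd_dev, cqid);

	/* ...and unpublish it at destroy time. */
	remove_handle(nd_dev, &nd_dev->cqidr, cq->cqn);

The BUG_ON(ret == -ENOSPC) in insert_handle() may deserve a second look: with an [id, id + 1) allocation range, -ENOSPC simply means the id is already in use, which reads more like an error-return case than a panic.)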
> -int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev, > - struct hvnd_ib_pd *hvnd_pd); > - > -/* > - * CQ operations. > - */ > -int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_cq *cq); > -int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq); > -int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq, > - u32 notify_type, u64 irp_handle); > - > -/* > - * QP operations. > - */ > -int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_qp *qp); > - > -int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_qp *qp); > - > -int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_qp *qp); > - > -/* > - * MR operations. > - */ > - > -int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 pd_handle, u64 *mr_handle); > - > -int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle); > - > -int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_mr *mr); > -int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle); > - > -/* > - * Listener operations > - */ > -int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, u64 *handle); > - > - > -int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle); > - > -int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, union nd_sockaddr_inet *addr); > - > -int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, u32 backlog); > - > -int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, union nd_sockaddr_inet *addr); > - > -int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, u64 connector_handle, > - u64 irp_handle); > - > -/* > - * Connector operations.
> - */ > -int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 *connector_handle); > - > -int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle); > - > -int hvnd_cancelio_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle); > -int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle, union nd_sockaddr_inet *addr); > - > -int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit, > - u32 priv_data_length, const u8 *priv_data, > - u64 qp_handle, struct if_physical_addr *phys_addr, > - union nd_sockaddr_inet *dest_addr, struct hvnd_ep_obj *ep); > - > -int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, enum ibv_qp_state *qp_state); > - > -int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, > - u64 qp_handle, > - u32 in_rd_limit, u32 out_rd_limit, > - u32 priv_data_length, const u8 *priv_data, > - enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep); > - > -int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, > - u32 priv_data_length, u8 *priv_data, > - enum ibv_qp_state *qp_state); > - > -int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - struct nd_read_limits *rd_limits); > - > -int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - u8 *priv_data); > - > -int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - union nd_sockaddr_inet *peer_addr); > - > -int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - union nd_sockaddr_inet *local_addr); > - > -int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, struct hvnd_ep_obj *ep); > - > - > -int hvnd_connector_disconnect(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, struct hvnd_ep_obj *ep); > - > -int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle, u32 ioctl); > - > -int hvnd_cancel_io(struct hvnd_ep_obj *ep_object); > - > -char *hvnd_get_op_name(int ioctl); > - > -void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx); > -void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev,struct hvnd_ucontext *uctx); > -void hvnd_process_events(struct work_struct *work); > - > -void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep, int status); > -void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep, int status); > -void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status); > -void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status); > -void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status); > - > -void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp); > -int get_irp_handle(struct hvnd_dev *nd_dev, u32 *local, void *irp_ctx); > - > -void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr, > - u32 data_sz, u32 local, u32 remote, > - u32 ioctl_code, > - u32 ext_data_sz, u32 ext_data_offset, > - u64 irp_handle); > - > -int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev, > - struct ndv_packet_hdr_control_1 *hdr, > - u32 pkt_size, u64 
cookie); > - > -int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - union nd_sockaddr_inet *og_addr); > - > -int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote, char *mac_addr); > - > -void hvnd_addr_init(void); > - > -void hvnd_addr_deinit(void); > - > -bool ep_add_work_pending(struct hvnd_ep_obj *ep_object); > -void ep_del_work_pending(struct hvnd_ep_obj *ep_object); > -void ep_stop(struct hvnd_ep_obj *ep_object); > - > -#define current_pid() (current->pid) > -/* > - * NT STATUS defines. > - */ > - > -#define STATUS_SUCCESS 0x0 > -#define STATUS_PENDING 0x00000103 > -#define STATUS_CANCELLED 0xC0000120 > -#define STATUS_DISCONNECTED 0xC000020C > -#define STATUS_TIMEOUT 0xC00000B5 > - > -void inc_ioctl_counter_request(unsigned ioctl); > -void inc_ioctl_counter_response(unsigned ioctl); > - > -#define NDV_PROTOCOL_VAERSION_INVALID -1 > -#define NDV_PACKET_INIT_SIZE 16 /* Size of the INIT packet */ > - > -#define HVND_RING_SZ (PAGE_SIZE * 64) > - > -/* logging levels */ > -#define HVND_ERROR 0 > -#define HVND_WARN 1 > -#define HVND_INFO 2 > -#define HVND_DEBUG 3 > - > -extern int hvnd_log_level; > - > -#define hvnd_error(fmt, args...) hvnd_log(HVND_ERROR, fmt, ##args) > -#define hvnd_warn(fmt, args...) hvnd_log(HVND_WARN, fmt, ##args) > -#define hvnd_info(fmt, args...) hvnd_log(HVND_INFO, fmt, ##args) > -#define hvnd_debug(fmt, args...) hvnd_log(HVND_DEBUG, fmt, ##args) > - > -#define hvnd_log(level, fmt, args...) \ > -do { \ > - if (unlikely(hvnd_log_level >= (level))) \ > - printk(KERN_ERR "hvnd %s[%u]: " fmt, __func__, __LINE__, ##args); \ > -} while (0) > - > -#endif /* _VMBUS_RDMA_H */ > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/COPYING b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/COPYING > new file mode 100644 > index 000000000000..f0960c946b07 > --- /dev/null > +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/COPYING > @@ -0,0 +1,339 @@ > + > +"This software program is licensed subject to the GNU General Public License > +(GPL). Version 2, June 1991, available at > +<http://www.fsf.org/copyleft/gpl.html>" > + > +GNU General Public License > + > +Version 2, June 1991 > + > +Copyright (C) 1989, 1991 Free Software Foundation, Inc. > +51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA > + > +Everyone is permitted to copy and distribute verbatim copies of this license > +document, but changing it is not allowed. > + > +Preamble > + > +The licenses for most software are designed to take away your freedom to > +share and change it. By contrast, the GNU General Public License is intended > +to guarantee your freedom to share and change free software--to make sure > +the software is free for all its users. This General Public License applies > +to most of the Free Software Foundation's software and to any other program > +whose authors commit to using it. (Some other Free Software Foundation > +software is covered by the GNU Library General Public License instead.) You > +can apply it to your programs, too. > + > +When we speak of free software, we are referring to freedom, not price. Our > +General Public Licenses are designed to make sure that you have the freedom > +to distribute copies of free software (and charge for this service if you > +wish), that you receive source code or can get it if you want it, that you > +can change the software or use pieces of it in new free programs; and that > +you know you can do these things. 
> + > +To protect your rights, we need to make restrictions that forbid anyone to > +deny you these rights or to ask you to surrender the rights. These > +restrictions translate to certain responsibilities for you if you distribute > +copies of the software, or if you modify it. > + > +For example, if you distribute copies of such a program, whether gratis or > +for a fee, you must give the recipients all the rights that you have. You > +must make sure that they, too, receive or can get the source code. And you > +must show them these terms so they know their rights. > + > +We protect your rights with two steps: (1) copyright the software, and (2) > +offer you this license which gives you legal permission to copy, distribute > +and/or modify the software. > + > +Also, for each author's protection and ours, we want to make certain that > +everyone understands that there is no warranty for this free software. If > +the software is modified by someone else and passed on, we want its > +recipients to know that what they have is not the original, so that any > +problems introduced by others will not reflect on the original authors' > +reputations. > + > +Finally, any free program is threatened constantly by software patents. We > +wish to avoid the danger that redistributors of a free program will > +individually obtain patent licenses, in effect making the program > +proprietary. To prevent this, we have made it clear that any patent must be > +licensed for everyone's free use or not licensed at all. > + > +The precise terms and conditions for copying, distribution and modification > +follow. > + > +TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION > + > +0. This License applies to any program or other work which contains a notice > + placed by the copyright holder saying it may be distributed under the > + terms of this General Public License. The "Program", below, refers to any > + such program or work, and a "work based on the Program" means either the > + Program or any derivative work under copyright law: that is to say, a > + work containing the Program or a portion of it, either verbatim or with > + modifications and/or translated into another language. (Hereinafter, > + translation is included without limitation in the term "modification".) > + Each licensee is addressed as "you". > + > + Activities other than copying, distribution and modification are not > + covered by this License; they are outside its scope. The act of running > + the Program is not restricted, and the output from the Program is covered > + only if its contents constitute a work based on the Program (independent > + of having been made by running the Program). Whether that is true depends > + on what the Program does. > + > +1. You may copy and distribute verbatim copies of the Program's source code > + as you receive it, in any medium, provided that you conspicuously and > + appropriately publish on each copy an appropriate copyright notice and > + disclaimer of warranty; keep intact all the notices that refer to this > + License and to the absence of any warranty; and give any other recipients > + of the Program a copy of this License along with the Program. > + > + You may charge a fee for the physical act of transferring a copy, and you > + may at your option offer warranty protection in exchange for a fee. > + > +2. 
You may modify your copy or copies of the Program or any portion of it, > + thus forming a work based on the Program, and copy and distribute such > + modifications or work under the terms of Section 1 above, provided that > + you also meet all of these conditions: > + > + * a) You must cause the modified files to carry prominent notices stating > + that you changed the files and the date of any change. > + > + * b) You must cause any work that you distribute or publish, that in > + whole or in part contains or is derived from the Program or any part > + thereof, to be licensed as a whole at no charge to all third parties > + under the terms of this License. > + > + * c) If the modified program normally reads commands interactively when > + run, you must cause it, when started running for such interactive > + use in the most ordinary way, to print or display an announcement > + including an appropriate copyright notice and a notice that there is > + no warranty (or else, saying that you provide a warranty) and that > + users may redistribute the program under these conditions, and > + telling the user how to view a copy of this License. (Exception: if > + the Program itself is interactive but does not normally print such > + an announcement, your work based on the Program is not required to > + print an announcement.) > + > + These requirements apply to the modified work as a whole. If identifiable > + sections of that work are not derived from the Program, and can be > + reasonably considered independent and separate works in themselves, then > + this License, and its terms, do not apply to those sections when you > + distribute them as separate works. But when you distribute the same > + sections as part of a whole which is a work based on the Program, the > + distribution of the whole must be on the terms of this License, whose > + permissions for other licensees extend to the entire whole, and thus to > + each and every part regardless of who wrote it. > + > + Thus, it is not the intent of this section to claim rights or contest > + your rights to work written entirely by you; rather, the intent is to > + exercise the right to control the distribution of derivative or > + collective works based on the Program. > + > + In addition, mere aggregation of another work not based on the Program > + with the Program (or with a work based on the Program) on a volume of a > + storage or distribution medium does not bring the other work under the > + scope of this License. > + > +3. You may copy and distribute the Program (or a work based on it, under > + Section 2) in object code or executable form under the terms of Sections > + 1 and 2 above provided that you also do one of the following: > + > + * a) Accompany it with the complete corresponding machine-readable source > + code, which must be distributed under the terms of Sections 1 and 2 > + above on a medium customarily used for software interchange; or, > + > + * b) Accompany it with a written offer, valid for at least three years, > + to give any third party, for a charge no more than your cost of > + physically performing source distribution, a complete machine- > + readable copy of the corresponding source code, to be distributed > + under the terms of Sections 1 and 2 above on a medium customarily > + used for software interchange; or, > + > + * c) Accompany it with the information you received as to the offer to > + distribute corresponding source code. 
(This alternative is allowed > + only for noncommercial distribution and only if you received the > + program in object code or executable form with such an offer, in > + accord with Subsection b above.) > + > + The source code for a work means the preferred form of the work for > + making modifications to it. For an executable work, complete source code > + means all the source code for all modules it contains, plus any > + associated interface definition files, plus the scripts used to control > + compilation and installation of the executable. However, as a special > + exception, the source code distributed need not include anything that is > + normally distributed (in either source or binary form) with the major > + components (compiler, kernel, and so on) of the operating system on which > + the executable runs, unless that component itself accompanies the > + executable. > + > + If distribution of executable or object code is made by offering access > + to copy from a designated place, then offering equivalent access to copy > + the source code from the same place counts as distribution of the source > + code, even though third parties are not compelled to copy the source > + along with the object code. > + > +4. You may not copy, modify, sublicense, or distribute the Program except as > + expressly provided under this License. Any attempt otherwise to copy, > + modify, sublicense or distribute the Program is void, and will > + automatically terminate your rights under this License. However, parties > + who have received copies, or rights, from you under this License will not > + have their licenses terminated so long as such parties remain in full > + compliance. > + > +5. You are not required to accept this License, since you have not signed > + it. However, nothing else grants you permission to modify or distribute > + the Program or its derivative works. These actions are prohibited by law > + if you do not accept this License. Therefore, by modifying or > + distributing the Program (or any work based on the Program), you > + indicate your acceptance of this License to do so, and all its terms and > + conditions for copying, distributing or modifying the Program or works > + based on it. > + > +6. Each time you redistribute the Program (or any work based on the > + Program), the recipient automatically receives a license from the > + original licensor to copy, distribute or modify the Program subject to > + these terms and conditions. You may not impose any further restrictions > + on the recipients' exercise of the rights granted herein. You are not > + responsible for enforcing compliance by third parties to this License. > + > +7. If, as a consequence of a court judgment or allegation of patent > + infringement or for any other reason (not limited to patent issues), > + conditions are imposed on you (whether by court order, agreement or > + otherwise) that contradict the conditions of this License, they do not > + excuse you from the conditions of this License. If you cannot distribute > + so as to satisfy simultaneously your obligations under this License and > + any other pertinent obligations, then as a consequence you may not > + distribute the Program at all. For example, if a patent license would > + not permit royalty-free redistribution of the Program by all those who > + receive copies directly or indirectly through you, then the only way you > + could satisfy both it and this License would be to refrain entirely from > + distribution of the Program. 
> + > + If any portion of this section is held invalid or unenforceable under any > + particular circumstance, the balance of the section is intended to apply > + and the section as a whole is intended to apply in other circumstances. > + > + It is not the purpose of this section to induce you to infringe any > + patents or other property right claims or to contest validity of any > + such claims; this section has the sole purpose of protecting the > + integrity of the free software distribution system, which is implemented > + by public license practices. Many people have made generous contributions > + to the wide range of software distributed through that system in > + reliance on consistent application of that system; it is up to the > + author/donor to decide if he or she is willing to distribute software > + through any other system and a licensee cannot impose that choice. > + > + This section is intended to make thoroughly clear what is believed to be > + a consequence of the rest of this License. > + > +8. If the distribution and/or use of the Program is restricted in certain > + countries either by patents or by copyrighted interfaces, the original > + copyright holder who places the Program under this License may add an > + explicit geographical distribution limitation excluding those countries, > + so that distribution is permitted only in or among countries not thus > + excluded. In such case, this License incorporates the limitation as if > + written in the body of this License. > + > +9. The Free Software Foundation may publish revised and/or new versions of > + the General Public License from time to time. Such new versions will be > + similar in spirit to the present version, but may differ in detail to > + address new problems or concerns. > + > + Each version is given a distinguishing version number. If the Program > + specifies a version number of this License which applies to it and "any > + later version", you have the option of following the terms and > + conditions either of that version or of any later version published by > + the Free Software Foundation. If the Program does not specify a version > + number of this License, you may choose any version ever published by the > + Free Software Foundation. > + > +10. If you wish to incorporate parts of the Program into other free programs > + whose distribution conditions are different, write to the author to ask > + for permission. For software which is copyrighted by the Free Software > + Foundation, write to the Free Software Foundation; we sometimes make > + exceptions for this. Our decision will be guided by the two goals of > + preserving the free status of all derivatives of our free software and > + of promoting the sharing and reuse of software generally. > + > + NO WARRANTY > + > +11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY > + FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN > + OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES > + PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER > + EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED > + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE > + ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH > + YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL > + NECESSARY SERVICING, REPAIR OR CORRECTION. > + > +12. 
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING > + WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR > + REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR > + DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL > + DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM > + (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED > + INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF > + THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR > + OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. > + > +END OF TERMS AND CONDITIONS > + > +How to Apply These Terms to Your New Programs > + > +If you develop a new program, and you want it to be of the greatest > +possible use to the public, the best way to achieve this is to make it free > +software which everyone can redistribute and change under these terms. > + > +To do so, attach the following notices to the program. It is safest to > +attach them to the start of each source file to most effectively convey the > +exclusion of warranty; and each file should have at least the "copyright" > +line and a pointer to where the full notice is found. > + > +one line to give the program's name and an idea of what it does. > +Copyright (C) yyyy name of author > + > +This program is free software; you can redistribute it and/or modify it > +under the terms of the GNU General Public License as published by the Free > +Software Foundation; either version 2 of the License, or (at your option) > +any later version. > + > +This program is distributed in the hope that it will be useful, but WITHOUT > +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or > +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for > +more details. > + > +You should have received a copy of the GNU General Public License along with > +this program; if not, write to the Free Software Foundation, Inc., 59 > +Temple Place - Suite 330, Boston, MA 02111-1307, USA. > + > +Also add information on how to contact you by electronic and paper mail. > + > +If the program is interactive, make it output a short notice like this when > +it starts in an interactive mode: > + > +Gnomovision version 69, Copyright (C) year name of author Gnomovision comes > +with ABSOLUTELY NO WARRANTY; for details type 'show w'. This is free > +software, and you are welcome to redistribute it under certain conditions; > +type 'show c' for details. > + > +The hypothetical commands 'show w' and 'show c' should show the appropriate > +parts of the General Public License. Of course, the commands you use may be > +called something other than 'show w' and 'show c'; they could even be > +mouse-clicks or menu items--whatever suits your program. > + > +You should also get your employer (if you work as a programmer) or your > +school, if any, to sign a "copyright disclaimer" for the program, if > +necessary. Here is a sample; alter the names: > + > +Yoyodyne, Inc., hereby disclaims all copyright interest in the program > +'Gnomovision' (which makes passes at compilers) written by James Hacker. > + > +signature of Ty Coon, 1 April 1989 > +Ty Coon, President of Vice > + > +This General Public License does not permit incorporating your program into > +proprietary programs. If your program is a subroutine library, you may > +consider it more useful to permit linking proprietary applications with the > +library. 
If this is what you want to do, use the GNU Library General Public > +License instead of this License. > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/Makefile b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/Makefile > new file mode 100644 > index 000000000000..19afb983ea70 > --- /dev/null > +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/Makefile > @@ -0,0 +1,3 @@ > +obj-$(CONFIG_HYPERV_INFINIBAND_ND) += hv_network_direct_144.0.o > + > +hv_network_direct_144.0-y := provider.o vmbus_rdma.o hvnd_addr.o > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/hvnd_addr.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/hvnd_addr.c > new file mode 100644 > index 000000000000..c8c2295b63af > --- /dev/null > +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/hvnd_addr.c > @@ -0,0 +1,136 @@ > +/* > + * Copyright (c) 2014, Microsoft Corporation. > + * > + * Author: > + * K. Y. Srinivasan <kys@microsoft.com> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms of the GNU General Public License version 2 as published > + * by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or > + * NON INFRINGEMENT. See the GNU General Public License for more > + * details. > + * > + * Bug fixes/enhancements: Long Li <longli@microsoft.com> > + */ > + > +#include <linux/completion.h> > +#include <linux/module.h> > +#include <linux/errno.h> > +#include <linux/hyperv.h> > +#include <linux/efi.h> > +#include <linux/slab.h> > +#include <linux/cred.h> > +#include <linux/sched.h> > +#include <linux/types.h> > +#include <linux/completion.h> > +#include <linux/scatterlist.h> > +#include <rdma/ib_umem.h> > +#include <rdma/ib_addr.h> > + > +#include "vmbus_rdma.h" > + > + > +#include <linux/semaphore.h> > +#include <linux/fs.h> > +#include <linux/nls.h> > +#include <linux/workqueue.h> > +#include <linux/cdev.h> > +#include <linux/hyperv.h> > +#include <linux/sched.h> > +#include <linux/uaccess.h> > +#include <linux/miscdevice.h> > +#include <linux/hyperv.h> > + > + > +int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + union nd_sockaddr_inet *og_addr) > +{ > + int ret; > + /* > + * Query the host and select the first address. > + */ > + struct pkt_query_addr_list pkt; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + > + hvnd_init_hdr(&pkt.hdr, > + (sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1)), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST, 0, 0, 0); > + > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = uctx->adaptor_hdl; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, > + sizeof(pkt), (u64)&pkt); > + > + if (ret) > + return ret; > + > + /* > + * Copy the address out. 
> + */ > + > + memcpy(og_addr, &pkt.ioctl.out[0], sizeof(*og_addr)); > + return 0; > + > +} > + > +static struct rdma_addr_client self; > + > +struct resolve_cb_context { > + struct rdma_dev_addr *addr; > + struct completion comp; > +}; > + > +void hvnd_addr_init(void) > +{ > + rdma_addr_register_client(&self); > +} > + > +void hvnd_addr_deinit(void) > +{ > + rdma_addr_unregister_client(&self); > +} > + > +static void resolve_cb(int status, struct sockaddr *src_addr, > + struct rdma_dev_addr *addr, void *context) > +{ > + memcpy(((struct resolve_cb_context *)context)->addr, addr, sizeof(struct > + rdma_dev_addr)); > + complete(&((struct resolve_cb_context *)context)->comp); > +} > + > +int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote, > + char *mac_addr) > +{ > + struct rdma_dev_addr dev_addr; > + struct resolve_cb_context ctx; > + int ret; > + > + memset(&dev_addr, 0, sizeof(dev_addr)); > + dev_addr.net = &init_net; > + ctx.addr = &dev_addr; > + init_completion(&ctx.comp); > + > + ret = rdma_resolve_ip(&self, local, remote, &dev_addr, 1000, > + resolve_cb, &ctx); > + > + if (ret) { > + hvnd_error("rdma_resolve_ip failed ret=%d\n", ret); > + return ret; > + } > + > + wait_for_completion(&ctx.comp); > + memcpy(mac_addr, dev_addr.dst_dev_addr, ETH_ALEN); > + return ret; > +} > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/mx_abi.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/mx_abi.h > new file mode 100644 > index 000000000000..8f55cb408de0 > --- /dev/null > +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/mx_abi.h > @@ -0,0 +1,232 @@ > +/* > + * Copyright (c) 2005 Topspin Communications. All rights reserved. > + * Copyright (c) 2005 Cisco Systems. All rights reserved. > + * Copyright (c) 2005 PathScale, Inc. All rights reserved. > + * > + * This software is available to you under a choice of one of two > + * licenses. You may choose to be licensed under the terms of the GNU > + * General Public License (GPL) Version 2, available from the file > + * COPYING in the main directory of this source tree, or the > + * OpenIB.org BSD license below: > + * > + * Redistribution and use in source and binary forms, with or > + * without modification, are permitted provided that the following > + * conditions are met: > + * > + * - Redistributions of source code must retain the above > + * copyright notice, this list of conditions and the following > + * disclaimer. > + * > + * - Redistributions in binary form must reproduce the above > + * copyright notice, this list of conditions and the following > + * disclaimer in the documentation and/or other materials > + * provided with the distribution. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND > + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS > + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN > + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN > + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE > + * SOFTWARE. > + * > + * KYS: made some modifications. > + */ > + > +#ifndef MX_ABI_H > +#define MX_ABI_H > + > + > + > +/* > + * Make sure that all structs defined in this file remain laid out so > + * that they pack the same way on 32-bit and 64-bit architectures (to > + * avoid incompatibility between 32-bit userspace and 64-bit kernels). 
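hvnd_get_neigh_mac_addr() above is the usual blocking wrapper around an asynchronous resolver: rdma_resolve_ip() fires resolve_cb() later, and the caller parks on a completion until it does. A minimal sketch of just that pattern (start_async_op() is a hypothetical stand-in for rdma_resolve_ip(); the other names are illustrative too):

    #include <linux/completion.h>
    #include <linux/errno.h>

    struct sync_ctx {
            struct completion comp;
            int status;             /* result recorded by the callback */
    };

    /* Asynchronous callback: record the result, wake the waiter. */
    static void async_done(int status, void *context)
    {
            struct sync_ctx *ctx = context;

            ctx->status = status;
            complete(&ctx->comp);
    }

    static int blocking_wrapper(void)
    {
            struct sync_ctx ctx = { .status = -EIO };

            init_completion(&ctx.comp);
            start_async_op(async_done, &ctx);   /* hypothetical async call */
            wait_for_completion(&ctx.comp);     /* sleeps until async_done() runs */
            return ctx.status;
    }

One difference from the sketch worth a look: resolve_cb() in the patch discards its status argument, so a failed resolution still completes the waiter and hvnd_get_neigh_mac_addr() can copy out a dst_dev_addr that was never filled in.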
> + * Specifically: > + * - Do not use pointer types -- pass pointers in UINT64 instead. > + * - Make sure that any structure larger than 4 bytes is padded to a > + * multiple of 8 bytes. Otherwise the structure size will be > + * different between 32-bit and 64-bit architectures. > + */ > + > +enum ibv_get_context_mappings { > + IBV_GET_CONTEXT_UAR, > + IBV_GET_CONTEXT_BF, > + IBV_GET_CONTEXT_MAPPING_MAX > +}; > + > +struct ibv_get_context_req { > + > + union nd_mapping mappings[IBV_GET_CONTEXT_MAPPING_MAX]; > +}; > + > +struct ibv_get_context_resp { > + > + // mmap UAR and BF > + struct nd_mapping_result mapping_results[IBV_GET_CONTEXT_MAPPING_MAX]; > + > + // mmap Blue Flame > + int bf_buf_size; > + int bf_offset; > + > + // mlx4_query_device result > + int max_qp_wr; > + int max_sge; > + int max_cqe; > + > + // general parameters > + u32 cqe_size; > + u32 vend_id; > + u16 dev_id; > + u16 bf_reg_size; > + u16 bf_regs_per_page; > + u16 reserved1; > + > + // ibv_cmd_get_context result > + u32 qp_tab_size; > + > + u32 reserved2; > +}; > + > +struct ibv_alloc_pd_resp { > + u64 pd_handle; > + u32 pdn; > + u32 reserved; > +}; > + > +struct ibv_reg_mr { > + u64 start; > + u64 length; > + u64 hca_va; > + u32 access_flags; > + u32 pdn; > + u64 pd_handle; > +}; > + > +struct ibv_reg_mr_resp { > + u64 mr_handle; > + u32 lkey; > + u32 rkey; > +}; > + > + > +enum mlx4_ib_create_cq_mapping { > + MLX4_IB_CREATE_CQ_BUF, > + MLX4_IB_CREATE_CQ_DB, > + MLX4_IB_CREATE_CQ_ARM_SN, // Windows specific > + MLX4_IB_CREATE_CQ_MAPPING_MAX > +}; > + > +#define MLX4_CQ_FLAGS_ARM_IN_KERNEL 1 > + > +struct ibv_create_cq { > + union nd_mapping mappings[MLX4_IB_CREATE_CQ_MAPPING_MAX]; > + u32 flags; > +}; > + > +struct ibv_create_cq_resp { > + struct nd_mapping_result mapping_results[MLX4_IB_CREATE_CQ_MAPPING_MAX]; > + u32 cqn; > + u32 cqe; > +}; > + > +enum mlx4_ib_create_srq_mappings { > + MLX4_IB_CREATE_SRQ_BUF, > + MLX4_IB_CREATE_SRQ_DB, > + MLX4_IB_CREATE_SRQ_MAPPINGS_MAX > +}; > + > +struct ibv_create_srq { > + union nd_mapping mappings[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX]; > +}; > + > +struct ibv_create_srq_resp { > + struct nd_mapping_result mapping_results[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX]; > +}; > + > +enum mlx4_ib_create_qp_mappings { > + MLX4_IB_CREATE_QP_BUF, > + MLX4_IB_CREATE_QP_DB, > + MLX4_IB_CREATE_QP_MAPPINGS_MAX > +}; > + > +struct ibv_create_qp { > + union nd_mapping mappings[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; > + u8 log_sq_bb_count; > + u8 log_sq_stride; > + u8 sq_no_prefetch; > + u8 reserved; > +}; > + > +struct ibv_create_qp_resp { > + struct nd_mapping_result mapping_results[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; > + // struct ib_uverbs_create_qp_resp > + u64 qp_handle; > + u32 qpn; > + u32 max_send_wr; > + u32 max_recv_wr; > + u32 max_send_sge; > + u32 max_recv_sge; > + u32 max_inline_data; > +}; > + > +enum ibv_qp_attr_mask { > + IBV_QP_STATE = 1 << 0, > + IBV_QP_CUR_STATE = 1 << 1, > + IBV_QP_EN_SQD_ASYNC_NOTIFY = 1 << 2, > + IBV_QP_ACCESS_FLAGS = 1 << 3, > + IBV_QP_PKEY_INDEX = 1 << 4, > + IBV_QP_PORT = 1 << 5, > + IBV_QP_QKEY = 1 << 6, > + IBV_QP_AV = 1 << 7, > + IBV_QP_PATH_MTU = 1 << 8, > + IBV_QP_TIMEOUT = 1 << 9, > + IBV_QP_RETRY_CNT = 1 << 10, > + IBV_QP_RNR_RETRY = 1 << 11, > + IBV_QP_RQ_PSN = 1 << 12, > + IBV_QP_MAX_QP_RD_ATOMIC = 1 << 13, > + IBV_QP_ALT_PATH = 1 << 14, > + IBV_QP_MIN_RNR_TIMER = 1 << 15, > + IBV_QP_SQ_PSN = 1 << 16, > + IBV_QP_MAX_DEST_RD_ATOMIC = 1 << 17, > + IBV_QP_PATH_MIG_STATE = 1 << 18, > + IBV_QP_CAP = 1 << 19, > + IBV_QP_DEST_QPN = 1 << 20 > +}; > + > +enum ibv_qp_state 
{ > + IBV_QPS_RESET, > + IBV_QPS_INIT, > + IBV_QPS_RTR, > + IBV_QPS_RTS, > + IBV_QPS_SQD, > + IBV_QPS_SQE, > + IBV_QPS_ERR > +}; > + > + > +struct ibv_modify_qp_resp { > + enum ibv_qp_attr_mask attr_mask; > + u8 qp_state; > + u8 reserved[3]; > +}; > + > +struct ibv_create_ah_resp { > + u64 start; > +}; > + > +/* > + * Some mlx4 specific kernel definitions. Perhaps could be in > + * separate file. > + */ > + > +struct mlx4_ib_user_db_page { > + struct list_head list; > + struct ib_umem *umem; > + unsigned long user_virt; > + int refcnt; > +}; > + > + > +#endif /* MX_ABI_H */ > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/provider.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/provider.c > new file mode 100644 > index 000000000000..7115cf769fe0 > --- /dev/null > +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/provider.c > @@ -0,0 +1,2961 @@ > +/* > + * Copyright (c) 2014, Microsoft Corporation. > + * > + * Author: > + * K. Y. Srinivasan <kys@microsoft.com> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms of the GNU General Public License version 2 as published > + * by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or > + * NON INFRINGEMENT. See the GNU General Public License for more > + * details. > + * > + * Bug fixes/enhancements: Long Li <longli@microsoft.com> > + */ > + > +#include <linux/module.h> > +#include <linux/moduleparam.h> > +#include <linux/device.h> > +#include <linux/netdevice.h> > +#include <linux/etherdevice.h> > +#include <linux/delay.h> > +#include <linux/errno.h> > +#include <linux/list.h> > +#include <linux/spinlock.h> > +#include <linux/ethtool.h> > +#include <linux/rtnetlink.h> > +#include <linux/inetdevice.h> > +#include <linux/io.h> > +#include <linux/hyperv.h> > +#include <linux/completion.h> > +#include <rdma/iw_cm.h> > +#include <rdma/ib_verbs.h> > +#include <rdma/ib_smi.h> > +#include <rdma/ib_umem.h> > +#include <rdma/ib_user_verbs.h> > +#include <rdma/rdma_cm.h> > +#include <linux/proc_fs.h> > +#include <linux/seq_file.h> > +#include "vmbus_rdma.h" > + > +/* > + * We are emulating mlx4. XXXKYS: May have to FIX. 
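The layout rule spelled out at the top of mx_abi.h (no pointer types, identical struct sizes for 32-bit and 64-bit builds) is the standard user/kernel ABI discipline. A minimal userspace illustration of the rule, with a hypothetical struct and a compile-time size check:

    #include <stdint.h>

    struct abi_example {
            uint64_t buf_addr;  /* a pointer crossing the ABI, passed as u64 */
            uint32_t flags;
            uint32_t reserved;  /* explicit pad: keeps sizeof() at 16 on both
                                 * 32-bit and 64-bit builds */
    };

    _Static_assert(sizeof(struct abi_example) % 8 == 0,
                   "pad ABI structs to a multiple of 8 bytes");

Without the reserved field a 64-bit compiler would still round the struct up to 16 bytes while a 32-bit one could make it 12, and the two sides would then disagree on the offset of every field that follows in a larger message.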
> + */ > +#include "./user.h" > + > +static struct hvnd_dev *g_nd_dev; // the one and only one > + > +int hvnd_log_level = HVND_ERROR; > +module_param(hvnd_log_level, int, 0644); > +MODULE_PARM_DESC(hvnd_log_level, > + "Logging level, 0 - Error (default), 1 - Warning, 2 - Info, 3 - Debug."); > + > +static int disable_cq_notify = 1; > +//static int disable_cq_notify = 0; > +module_param(disable_cq_notify, int, 0644); > +MODULE_PARM_DESC(disable_cq_notify, > + "Disable CQ notification, 0 - Enable, 1 - Disable (default)."); > + > +enum { > + MLX4_USER_DEV_CAP_64B_CQE = 1L << 0 > +}; > + > +#define HVND_NODE_DESC "vmbus-RDMA" > + > +#undef MLX4_IB_UVERBS_ABI_VERSION > +#define MLX4_IB_UVERBS_ABI_VERSION 4 > + > +struct mlx4_wqe_data_seg { > + __be32 byte_count; > + __be32 lkey; > + __be64 addr; > +}; > + > +/* return value: > + * true: ep is running > + * false: ep is stopped > + */ > +bool ep_add_work_pending(struct hvnd_ep_obj *ep_object) > +{ > + bool ret = true; > + > + atomic_inc(&ep_object->nr_requests_pending); > + if (ep_object->stopping) { > + if (atomic_dec_and_test(&ep_object->nr_requests_pending)) > + wake_up(&ep_object->wait_pending); > + ret = false; > + } > + return ret; > +} > + > +void ep_del_work_pending(struct hvnd_ep_obj *ep_object) > +{ > + if (atomic_dec_and_test(&ep_object->nr_requests_pending)) > + wake_up(&ep_object->wait_pending); > + > + if (atomic_read(&ep_object->nr_requests_pending) < 0) { > + hvnd_error("ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", > + atomic_read(&ep_object->nr_requests_pending), > + ep_object->type, ep_object->cm_state); > + dump_stack(); > + } > +} > + > +void ep_stop(struct hvnd_ep_obj *ep_object) > +{ > + if (!ep_object->stopping) { > + ep_object->stopping = true; > + hvnd_cancel_io(ep_object); > + } > + > + if (atomic_read(&ep_object->nr_requests_pending) < 0) { > + hvnd_error("IO canceled, ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", > + atomic_read(&ep_object->nr_requests_pending), > + ep_object->type, ep_object->cm_state); > + dump_stack(); > + } > + > + wait_event(ep_object->wait_pending, > + !atomic_read(&ep_object->nr_requests_pending)); > +} > + > +static int vmbus_dma_map_sg(struct device *dev, struct scatterlist *sgl, > + int nents, enum dma_data_direction direction, > + unsigned long attrs) > +{ > + struct scatterlist *sg; > + u64 addr; > + int i; > + int ret = nents; > + > + BUG_ON(!valid_dma_direction(direction)); > + > + for_each_sg(sgl, sg, nents, i) { > + addr = (u64) page_address(sg_page(sg)); > + /* TODO: handle highmem pages */ > + if (!addr) { > + ret = 0; > + break; > + } > + sg->dma_address = addr + sg->offset; > + sg->dma_length = sg->length; > + } > + return ret; > +} > + > +static void vmbus_dma_unmap_sg(struct device *dev, > + struct scatterlist *sg, int nents, > + enum dma_data_direction direction, unsigned long attrs) > +{ > + BUG_ON(!valid_dma_direction(direction)); > +} > + > + > +const struct dma_map_ops vmbus_dma_ops = { > + .map_sg = vmbus_dma_map_sg, > + .unmap_sg = vmbus_dma_unmap_sg, > +}; > + > +static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, > + struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx); > + > +static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, > + struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx); > + > +static void hvnd_deinit_ep(struct hvnd_ep_obj *ep) > +{ > + put_irp_handle(ep->nd_dev, ep->local_irp); > +} > + > +static void hvnd_destroy_ep(struct hvnd_ep_obj *ep) > +{ > + hvnd_debug("canceling work for 
ep %p\n", ep); > + cancel_work_sync(&ep->wrk.work); > + hvnd_deinit_ep(ep); > + kfree(ep); > +} > + > + > +#define UC(b) (((int)b)&0xff) > +char *debug_inet_ntoa(struct in_addr in, char *b) > +{ > + register char *p; > + > + p = (char *)&in; > + (void)snprintf(b, 20, > + "%d.%d.%d.%d", UC(p[0]), UC(p[1]), UC(p[2]), UC(p[3])); > + return b; > +} > + > +void hvnd_process_events(struct work_struct *work); > + > +static int hvnd_init_ep(struct hvnd_ep_obj *ep_object, > + struct iw_cm_id *cm_id, int ep_type, > + struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx) > +{ > + int ret; > + > + ep_object->type = ep_type; > + ep_object->cm_id = cm_id; > + ep_object->nd_dev = nd_dev; > + ep_object->uctx = uctx; > + > + ep_object->parent = NULL; > + > + ep_object->wrk.callback_arg = ep_object; > + INIT_WORK(&ep_object->wrk.work, hvnd_process_events); > + INIT_LIST_HEAD(&ep_object->incoming_pkt_list); > + spin_lock_init(&ep_object->incoming_pkt_list_lock); > + > + ep_object->stopping = false; > + atomic_set(&ep_object->nr_requests_pending, 0); > + init_waitqueue_head(&ep_object->wait_pending); > + > + ret = get_irp_handle(nd_dev, &ep_object->local_irp, (void *)ep_object); > + > + if (ret) { > + hvnd_error("get_irp_handle() failed: err: %d\n", ret); > + return ret; > + } > + return 0; > + > +} > + > +static int set_rq_size(struct hvnd_dev *dev, struct ib_qp_cap *cap, > + struct hvnd_qp *qp) > +{ > + > + /* HW requires >= 1 RQ entry with >= 1 gather entry */ > + if (!cap->max_recv_wr || !cap->max_recv_sge) > + return -EINVAL; > + > + qp->rq_wqe_cnt = roundup_pow_of_two(max(1U, cap->max_recv_wr)); > + qp->rq_max_gs = roundup_pow_of_two(max(1U, cap->max_recv_sge)); > + qp->rq_wqe_shift = ilog2( > + qp->rq_max_gs * sizeof(struct mlx4_wqe_data_seg)); > + > + > + return 0; > +} > + > +static int set_user_sq_size(struct hvnd_dev *dev, > + struct hvnd_qp *qp, > + struct mlx4_ib_create_qp *ucmd) > +{ > + qp->sq_wqe_cnt = 1 << ucmd->log_sq_bb_count; > + qp->sq_wqe_shift = ucmd->log_sq_stride; > + > + qp->buf_size = (qp->rq_wqe_cnt << qp->rq_wqe_shift) + > + (qp->sq_wqe_cnt << qp->sq_wqe_shift); > + > + return 0; > +} > + > +static int hvnd_db_map_user(struct hvnd_ucontext *uctx, unsigned long virt, > + struct ib_umem **db_umem) > +{ > + struct mlx4_ib_user_db_page *page; > + int err = 0; > + > + mutex_lock(&uctx->db_page_mutex); > + > + list_for_each_entry(page, &uctx->db_page_list, list) > + if (page->user_virt == (virt & PAGE_MASK)) > + goto found; > + > + page = kmalloc(sizeof(*page), GFP_KERNEL); > + if (!page) { > + err = -ENOMEM; > + goto out; > + } > + > + page->user_virt = (virt & PAGE_MASK); > + page->refcnt = 0; > + page->umem = ib_umem_get(&uctx->ibucontext, virt & PAGE_MASK, > + PAGE_SIZE, 0, 0); > + if (IS_ERR(page->umem)) { > + hvnd_error("ib_umem_get failure\n"); > + err = PTR_ERR(page->umem); > + kfree(page); > + goto out; > + } > + > + list_add(&page->list, &uctx->db_page_list); > + > +found: > + ++page->refcnt; > +out: > + mutex_unlock(&uctx->db_page_mutex); > + if (!err) > + *db_umem = page->umem; > + > + return err; > +} > + > +static void hvnd_db_unmap_user(struct hvnd_ucontext *uctx, u64 db_addr) > +{ > + struct mlx4_ib_user_db_page *page; > + > + mutex_lock(&uctx->db_page_mutex); > + list_for_each_entry(page, &uctx->db_page_list, list) > + if (page->user_virt == (db_addr & PAGE_MASK)) > + goto found; > + > +found: > + if (!--page->refcnt) { > + list_del(&page->list); > + ib_umem_release(page->umem); > + kfree(page); > + } > + > + mutex_unlock(&uctx->db_page_mutex); > +} > + > + > +static
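To make the sizing arithmetic in set_rq_size()/set_user_sq_size() concrete, here is one worked example (the input values are made up; the formulas are the ones above):

    /*
     * Assume cap->max_recv_wr = 100, cap->max_recv_sge = 3, and the
     * 16-byte struct mlx4_wqe_data_seg (4 + 4 + 8) defined earlier:
     *
     *   rq_wqe_cnt   = roundup_pow_of_two(100) = 128
     *   rq_max_gs    = roundup_pow_of_two(3)   = 4
     *   rq_wqe_shift = ilog2(4 * 16)           = 6   (64-byte stride)
     *
     * With a userspace SQ of log_sq_bb_count = 6 and log_sq_stride = 6:
     *
     *   buf_size = (128 << 6) + (64 << 6) = 8192 + 4096 = 12288 bytes
     */

so the QP buffer umem registered later in hvnd_ib_create_qp() would cover exactly three pages on a 4K-page system.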
void debug_check(const char *func, int line) > +{ > + hvnd_debug("func is: %s; line is %d\n", func, line); > + > + if (in_interrupt()) { > + hvnd_error("In interrupt func is: %s; line is %d\n", > + func, line); > + return; > + } > +} > + > +static struct ib_ah *hvnd_ah_create(struct ib_pd *pd, > + struct ib_ah_attr *ah_attr, > + struct ib_udata *udata) > +{ > + debug_check(__func__, __LINE__); > + return ERR_PTR(-ENOSYS); > +} > + > +static int hvnd_ah_destroy(struct ib_ah *ah) > +{ > + debug_check(__func__, __LINE__); > + return -ENOSYS; > +} > + > +static int hvnd_multicast_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) > +{ > + debug_check(__func__, __LINE__); > + return -ENOSYS; > +} > + > +static int hvnd_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) > +{ > + debug_check(__func__, __LINE__); > + return -ENOSYS; > +} > + > +static int hvnd_process_mad(struct ib_device *ibdev, > + int mad_flags, > + u8 port_num, > + const struct ib_wc *in_wc, > + const struct ib_grh *in_grh, > + const struct ib_mad_hdr *in_mad, > + size_t in_mad_size, > + struct ib_mad_hdr *out_mad, > + size_t *out_mad_size, > + u16 *out_mad_pkey_index) > +{ > + debug_check(__func__, __LINE__); > + return -ENOSYS; > +} > + > +void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx) > +{ > + atomic_inc(&uctx->refcnt); > +} > + > +void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx) > +{ > + if (atomic_dec_and_test(&uctx->refcnt)) { > + hvnd_debug("uctx ref cnt dropped it is %d\n", > + atomic_read(&uctx->refcnt)); > + hvnd_debug("About to close adaptor\n"); > + hvnd_close_adaptor(nd_dev, uctx); > + } else > + hvnd_debug("uctx ref cnt dropped it is %d\n", > + atomic_read(&uctx->refcnt)); > +} > + > + > +static int hvnd_dealloc_ucontext(struct ib_ucontext *context) > +{ > + struct hvnd_dev *nd_dev; > + struct hvnd_ucontext *uctx; > + > + uctx = to_nd_context(context); > + nd_dev = to_nd_dev(context->device); > + > + hvnd_debug("calling %s\n", __func__); > + > + hvnd_drop_uctx_ref(nd_dev, uctx); > + > + return 0; > +} > + > +static struct ib_ucontext *hvnd_alloc_ucontext(struct ib_device *ibdev, > + struct ib_udata *udata) > +{ > + struct hvnd_dev *nd_dev = to_nd_dev(ibdev); > + struct hvnd_ucontext *uctx; > + struct mlx4_ib_alloc_ucontext_resp resp; > + struct ibv_get_context_resp *ctx_output; > + int ret; > + > + if (!nd_dev->ib_active) { > + hvnd_error("ib device is not active, try again\n"); > + return ERR_PTR(-EAGAIN); > + } > + > + uctx = get_uctx(nd_dev, current_pid()); > + if (uctx) { > + // it is already opened, just increase its reference count > + hvnd_acquire_uctx_ref(uctx); > + } else { > + > + /* > + * The Windows host expects the following to be done: > + * 1. Successfully send struct ndv_pkt_hdr_create_1 > + * 2. INIT PROVIDER > + * 3. Open Adapter > + * Before we can complete this call. > + */ > + > + uctx = kzalloc(sizeof(struct hvnd_ucontext), GFP_KERNEL); > + if (!uctx) > + return ERR_PTR(-ENOMEM); > + > + atomic_set(&uctx->refcnt, 1); > + INIT_LIST_HEAD(&uctx->db_page_list); > + mutex_init(&uctx->db_page_mutex); > + > + /* > + * Stash away the context with the calling PID. 
> + */ > + ret = insert_handle(nd_dev, &nd_dev->uctxidr, uctx, > + current_pid()); > + if (ret) { > + hvnd_error("Uctx ID insertion failed; ret is %d\n", > + ret); > + goto err1; > + } > + > + hvnd_debug("Opening adaptor pid is %d\n", current_pid()); > + > + ret = hvnd_open_adaptor(nd_dev, uctx); > + if (ret) { > + hvnd_error("hvnd_open_adaptor failed ret=%d\n", ret); > + goto err1; > + } > + > + } > + > + /* > + * Copy the response out. > + */ > + ctx_output = &uctx->o_adap_pkt.mappings.ctx_output; > + resp.dev_caps = MLX4_USER_DEV_CAP_64B_CQE; > + resp.qp_tab_size = ctx_output->qp_tab_size; > + resp.bf_reg_size = ctx_output->bf_reg_size; > + resp.bf_regs_per_page = ctx_output->bf_regs_per_page; > + resp.cqe_size = ctx_output->cqe_size; > + > + ret = ib_copy_to_udata(udata, &resp, sizeof(resp)); > + if (ret) { > + hvnd_error("ib_copy_to_udata failed ret=%d\n", ret); > + goto err1; > + } > + > + return &uctx->ibucontext; > + > +err1: > + kfree(uctx); > + return ERR_PTR(ret); > +} > + > +static int hvnd_mmap(struct ib_ucontext *context, struct vm_area_struct *vma) > +{ > + struct hvnd_ucontext *uctx = to_nd_context(context); > + > + if (vma->vm_end - vma->vm_start != PAGE_SIZE) { > + hvnd_error("vma not a page size, actual size=%lu\n", > + vma->vm_end - vma->vm_start); > + return -EINVAL; > + } > + > + if (vma->vm_pgoff == 0) { > + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); > + > + if (io_remap_pfn_range(vma, vma->vm_start, > + (uctx->uar_base >> PAGE_SHIFT), > + PAGE_SIZE, vma->vm_page_prot)) { > + hvnd_error("io_remap_pfn_range failure\n"); > + return -EAGAIN; > + } > + } else if (vma->vm_pgoff == 1 && uctx->bf_buf_size != 0) { > + vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); > + > + if (io_remap_pfn_range(vma, vma->vm_start, > + (uctx->uar_base >> PAGE_SHIFT) + 1, > + PAGE_SIZE, vma->vm_page_prot)) { > + hvnd_error("io_remap_pfn_range failure\n"); > + return -EAGAIN; > + } > + } else { > + hvnd_error("check code\n"); > + return -EINVAL; > + } > + > + return 0; > +} > + > +static int hvnd_deallocate_pd(struct ib_pd *pd) > +{ > + > + struct hvnd_ucontext *uctx; > + struct hvnd_dev *nd_dev; > + struct hvnd_ib_pd *hvnd_pd; > + struct ib_ucontext *ibuctx = pd->uobject->context; > + > + hvnd_pd = to_nd_pd(pd); > + nd_dev = to_nd_dev(pd->device); > + uctx = to_nd_context(ibuctx); > + > + hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, > + IOCTL_ND_PD_FREE); > + > + hvnd_drop_uctx_ref(nd_dev, uctx); > + return 0; > +} > + > +static struct ib_pd *hvnd_allocate_pd(struct ib_device *ibdev, > + struct ib_ucontext *context, > + struct ib_udata *udata) > +{ > + struct hvnd_ucontext *uctx; > + struct hvnd_dev *nd_dev; > + int ret; > + struct hvnd_ib_pd *hvnd_pd; > + > + if (!context) { > + hvnd_error("kernel mode context not supported\n"); > + return ERR_PTR(-EINVAL); > + } > + > + hvnd_pd = kzalloc(sizeof(struct hvnd_ib_pd), GFP_KERNEL); > + > + if (!hvnd_pd) > + return ERR_PTR(-ENOMEM); > + > + uctx = to_nd_context(context); > + nd_dev = to_nd_dev(ibdev); > + > + ret = hvnd_create_pd(uctx, nd_dev, hvnd_pd); > + if (ret) { > + hvnd_error("hvnd_create_pd failure ret=%d\n", ret); > + goto error_cr_pd; > + } > + > + if (context) { > + if (ib_copy_to_udata(udata, &hvnd_pd->pdn, sizeof(__u32))) { > + hvnd_error("ib_copy_to_udata failure\n"); > + ret = -EFAULT; > + goto error_fault; > + } > + } > + > + hvnd_acquire_uctx_ref(uctx); > + return &hvnd_pd->ibpd; > + > +error_fault: > + hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, > + IOCTL_ND_PD_FREE); > + > +error_cr_pd: > + 
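The PD path shows the ucontext lifetime rule used throughout the file: every object created against a ucontext takes a reference on success, and its destroy path drops it, mirroring the acquire/drop helpers defined earlier:

    /* Pairing used by PDs, QPs, CQs and MRs alike (sketch of the
     * helpers above, not new code):
     *
     *   create:  hvnd_acquire_uctx_ref(uctx);      atomic_inc(&uctx->refcnt)
     *   destroy: hvnd_drop_uctx_ref(nd_dev, uctx); atomic_dec_and_test(...)
     *              ... the last drop calls hvnd_close_adaptor(nd_dev, uctx)
     */

so the host adaptor stays open for exactly as long as some verb object still needs it.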
kfree(hvnd_pd); > + return ERR_PTR(ret); > +} > + > +static int hvnd_query_pkey(struct ib_device *ibdev, u8 port, u16 index, > + u16 *pkey) > +{ > + debug_check(__func__, __LINE__); > + *pkey = 0; > + return 0; > +} > + > +static int hvnd_query_gid(struct ib_device *ibdev, u8 port, int index, > + union ib_gid *gid) > +{ > + int ret; > + struct hvnd_dev *nd_dev = to_nd_dev(ibdev); > + > + debug_check(__func__, __LINE__); > + > + ret = wait_for_completion_timeout(&nd_dev->addr_set, 60*HZ); > + if (!ret) > + return -ETIMEDOUT; > + > + memset(&(gid->raw[0]), 0, sizeof(gid->raw)); > + memcpy(&(gid->raw[0]), nd_dev->mac_addr, 6); > + return 0; > +} > + > +static int hvnd_query_device(struct ib_device *ibdev, > + struct ib_device_attr *props, > + struct ib_udata *udata) > +{ > + struct hvnd_dev *nd_dev = to_nd_dev(ibdev); > + struct adapter_info_v2 *adap_info; > + > + if (!nd_dev->query_pkt_set) { > + hvnd_error("query packet not received yet\n"); > + return 0; > + } > + > + adap_info = &nd_dev->query_pkt.ioctl.ad_info; > + > + memset(props, 0, sizeof(*props)); > + > + /* > + * Copy the relevant properties out. > + */ > + props->fw_ver = 0; > + props->device_cap_flags = 0; > + //props->device_cap_flags |= IB_DEVICE_BAD_PKEY_CNTR; > + //props->device_cap_flags |= IB_DEVICE_BAD_QKEY_CNTR; > + //props->device_cap_flags |= IB_DEVICE_XRC; > + > + props->vendor_id = 0x15b3; > + props->vendor_part_id = adap_info->device_id; > + > + props->max_mr_size = ~0ull; > + props->page_size_cap = PAGE_SIZE; > + props->max_qp = 16384; > + props->max_qp_wr = min(adap_info->max_recv_q_depth, > + adap_info->max_initiator_q_depth); > + > + props->max_sge = min(adap_info->max_initiator_sge, > + adap_info->max_recv_sge); > + props->max_cq = 0x1FFFF; > + props->max_cqe = adap_info->max_completion_q_depth; > + props->max_mr = 16384; > + props->max_pd = 16384; > + > + props->max_qp_rd_atom = adap_info->max_inbound_read_limit; > + props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; > + props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; > + props->max_srq = 16384; > + props->max_srq_wr = adap_info->max_recv_q_depth; > + props->max_srq_sge = adap_info->max_recv_sge; > + > + return 0; > +} > + > +static int hvnd_query_port(struct ib_device *ibdev, u8 port, > + struct ib_port_attr *props) > +{ > + memset(props, 0, sizeof(struct ib_port_attr)); > + > + props->max_mtu = IB_MTU_4096; > + props->active_mtu = IB_MTU_4096; > + > + /* > + * KYS: TBD need to base this on netdev. 
> + */ > + props->state = IB_PORT_ACTIVE; > + > + props->port_cap_flags = IB_PORT_CM_SUP; > + > + props->gid_tbl_len = 1; > + props->pkey_tbl_len = 1; > + props->active_width = 1; > + props->active_speed = IB_SPEED_DDR; //KYS: check > + props->max_msg_sz = -1; > + > + return 0; > +} > + > +static enum rdma_link_layer > +hvnd_get_link_layer(struct ib_device *device, u8 port) > +{ > + return IB_LINK_LAYER_ETHERNET; > +} > + > +static ssize_t hvnd_show_rev(struct device *dev, struct device_attribute *attr, > + char *buf) > +{ > + return 0; > +} > + > +static ssize_t hvnd_show_fw_ver(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + return 0; > +} > + > +static ssize_t hvnd_show_hca(struct device *dev, struct device_attribute *attr, > + char *buf) > +{ > + return 0; > +} > + > +static ssize_t hvnd_show_board(struct device *dev, > + struct device_attribute *attr, char *buf) > +{ > + return 0; > +} > + > +static int hvnd_get_port_immutable(struct ib_device *ibdev, u8 port_num, > + struct ib_port_immutable *immutable) > +{ > + struct ib_port_attr attr; > + int err; > + > + err = hvnd_query_port(ibdev, port_num, &attr); > + if (err) > + return err; > + > + immutable->pkey_tbl_len = attr.pkey_tbl_len; > + immutable->gid_tbl_len = attr.gid_tbl_len; > + immutable->core_cap_flags = RDMA_CORE_PORT_IWARP; > + > + return 0; > +} > + > +static struct ib_qp *hvnd_ib_create_qp(struct ib_pd *pd, > + struct ib_qp_init_attr *attrs, > + struct ib_udata *udata) > +{ > + struct hvnd_ucontext *uctx; > + struct hvnd_dev *nd_dev; > + struct mlx4_ib_create_qp ucmd; > + struct hvnd_qp *qp; > + int ret = 0; > + struct hvnd_ib_pd *hvnd_pd = to_nd_pd(pd); > + struct hvnd_cq *send_cq = to_nd_cq(attrs->send_cq); > + struct hvnd_cq *recv_cq = to_nd_cq(attrs->recv_cq); > + > + uctx = get_uctx_from_pd(pd); > + nd_dev = to_nd_dev(pd->device); > + > + if (attrs->qp_type != IB_QPT_RC) { > + hvnd_error("attrs->qp_type=%d not IB_QPT_RC\n", attrs->qp_type); > + return ERR_PTR(-EINVAL); > + } > + > + qp = kzalloc(sizeof(*qp), GFP_KERNEL); > + if (!qp) { > + ret = -ENOMEM; > + goto err_done; > + } > + > + qp->uctx = uctx; > + > + if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { > + hvnd_error("ib_copy_from_udata failed\n"); > + ret = -EFAULT; > + goto err_ucpy; > + } > + > + qp->qp_buf = (void *)ucmd.buf_addr; > + qp->db_addr = (void *)ucmd.db_addr; > + qp->log_sq_bb_count = ucmd.log_sq_bb_count; > + qp->log_sq_stride = ucmd.log_sq_stride; > + qp->sq_no_prefetch = ucmd.sq_no_prefetch; > + qp->port = attrs->port_num; > + > + init_waitqueue_head(&qp->wait); > + atomic_set(&qp->refcnt, 1); > + > + qp->recv_cq = recv_cq; > + qp->send_cq = send_cq; > + qp->nd_dev = nd_dev; > + > + qp->receive_cq_handle = recv_cq->cq_handle; > + qp->initiator_cq_handle = send_cq->cq_handle; > + qp->pd_handle = hvnd_pd->handle; > + qp->cq_notify = false; > + > + qp->ibqp.qp_num = attrs->qp_type == IB_QPT_SMI ? 
0 : 1; > + > + qp->max_inline_data = attrs->cap.max_inline_data; > + > + qp->initiator_q_depth = attrs->cap.max_send_wr; > + qp->initiator_request_sge = attrs->cap.max_send_sge; > + > + > + qp->receive_q_depth = attrs->cap.max_recv_wr; > + qp->receive_request_sge = attrs->cap.max_recv_sge; > + > + set_rq_size(nd_dev, &attrs->cap, qp); > + > + set_user_sq_size(nd_dev, qp, &ucmd); > + > + qp->umem = ib_umem_get(&uctx->ibucontext, ucmd.buf_addr, > + qp->buf_size, 0, 0); > + if (IS_ERR(qp->umem)) { > + ret = PTR_ERR(qp->umem); > + hvnd_error("ib_umem_get failed ret=%d\n", ret); > + goto err_ucpy; > + } > + > + ret = hvnd_db_map_user(uctx, ucmd.db_addr, &qp->db_umem); > + > + if (ret) { > + hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); > + goto err_db_map; > + } > + > + ret = hvnd_create_qp(nd_dev, uctx, qp); > + > + if (ret) { > + hvnd_error("hvnd_create_qp failed ret=%d\n", ret); > + goto err_qp; > + } > + > + hvnd_acquire_uctx_ref(uctx); > + > + qp->ibqp.qp_num = qp->qpn; > + qp->ibqp.qp_type = IB_QPT_RC; > + > + > + return &qp->ibqp; > + > +err_qp: > + hvnd_db_unmap_user(uctx, ucmd.db_addr); > + > +err_db_map: > + ib_umem_release(qp->umem); > + > +err_ucpy: > + kfree(qp); > +err_done: > + return ERR_PTR(ret); > +} > + > +static int hvnd_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, > + int attr_mask, struct ib_udata *udata) > +{ > + struct hvnd_qp *qp = to_nd_qp(ibqp); > + struct hvnd_dev *nd_dev = to_nd_dev(ibqp->device); > + enum ib_qp_state cur_state, new_state; > + int ret = 0; > + > + if (attr != NULL) { > + cur_state = (attr_mask & IB_QP_CUR_STATE) > + ? attr->cur_qp_state : qp->qp_state; > + new_state = (attr_mask & IB_QP_STATE) > + ? attr->qp_state : cur_state; > + > + hvnd_debug("qp->qp_state is %d new state is %d\n", > + qp->qp_state, new_state); > + hvnd_debug("current qp state is %d\n", cur_state); > + if (attr_mask & IB_QP_STATE) { > + /* Ensure the state is valid */ > + if (attr->qp_state < 0 || attr->qp_state > IB_QPS_ERR) { > + hvnd_error("incorrect qp state attr->qp_state=%d\n", > + attr->qp_state); > + return -EINVAL; > + } > + > + if (qp->qp_state != new_state) { > + qp->qp_state = new_state; > + /* > + * The only state transition supported is the > + * transition to error state. 
> + */ > + switch (new_state) { > + case IB_QPS_ERR: > + case IB_QPS_SQD: > + ret = hvnd_flush_qp(nd_dev, qp->uctx, > + qp); > + > + if (ret) > + hvnd_error("hvnd_flush_qp failed ret=%d\n", ret); > + > + // immediately notify the upper layer > + // on disconnection > + if (!ret && qp->connector) > + hvnd_process_notify_disconnect( > + qp->connector, > + STATUS_SUCCESS); > + > + return ret; > + > + default: > + break; > + } > + } > + } > + } > + return 0; > +} > + > + > +static int hvnd_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, > + int attr_mask, struct ib_qp_init_attr *init_attr) > +{ > + struct hvnd_qp *qp = to_nd_qp(ibqp); > + > + memset(attr, 0, sizeof(*attr)); > + memset(init_attr, 0, sizeof(*init_attr)); > + > + attr->qp_state = qp->qp_state; > + > + init_attr->cap.max_send_wr = qp->max_send_wr; > + init_attr->cap.max_recv_wr = qp->max_recv_wr; > + > + init_attr->cap.max_send_sge = qp->max_send_sge; > + init_attr->cap.max_recv_sge = qp->max_recv_sge; > + init_attr->cap.max_inline_data = qp->max_inline_data; > + > + init_attr->sq_sig_type = IB_SIGNAL_ALL_WR; > + > + return 0; > +} > + > +static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status); > +static int hvnd_destroy_qp(struct ib_qp *ib_qp) > +{ > + int ret; > + struct hvnd_qp *qp = to_nd_qp(ib_qp); > + struct hvnd_dev *nd_dev = to_nd_dev(ib_qp->device); > + u64 jiffies; > + > + if (!qp->connector) { > + hvnd_warn("error: connector is NULL; skip destroying connector\n"); > + goto free_qp; > + } > + > + /* > + * should we flush the qp first on ctrl-C?, > + * no need to disconnect on abrupt shutdown? > + */ > + if (qp->qp_state != IB_QPS_ERR && qp->qp_state != IB_QPS_SQD) { > + hvnd_warn("qp_state=%d, doing abrupt disconnect\n", > + qp->qp_state); > + hvnd_flush_qp(nd_dev, qp->uctx, qp); > + > + ep_stop(qp->connector); > + > + // now no pending activity is possible on the connector > + > + switch (qp->connector->cm_state) { > + > + case hvnd_cm_idle: > + case hvnd_cm_connect_reply_refused: > + case hvnd_cm_connect_request_sent: > + case hvnd_cm_close_sent: > + hvnd_warn("cm_state = %d not doing anything\n", > + qp->connector->cm_state); > + break; > + > + case hvnd_cm_connect_received: > + hvnd_warn("cm_state = %d refusing pending connection request\n", > + qp->connector->cm_state); > + hvnd_refuse_connection(qp->connector, -ECONNREFUSED); > + break; > + > + case hvnd_cm_connect_reply_sent: > + case hvnd_cm_established_sent: > + case hvnd_cm_accept_sent: > + hvnd_warn("cm_state = %d notifying disconnect on existing connection\n", > + qp->connector->cm_state); > + hvnd_process_notify_disconnect(qp->connector, > + STATUS_CANCELLED); > + break; > + > + default: > + hvnd_error("unknown cm_state = %d\n", > + qp->connector->cm_state); > + > + } > + goto free_connector; > + } else { > + hvnd_debug("qp_state=%d, doing normal disconnect\n", > + qp->qp_state); > + } > + > + if (!ep_add_work_pending(qp->connector)) > + goto free_connector; > + > + init_completion(&qp->connector->disconnect_event); > + > + /* > + * First issue a disconnect on the connector. > + */ > + > + hvnd_debug("calling hvnd_connector_disconnect\n"); > + ret = hvnd_connector_disconnect(nd_dev, qp->uctx, > + qp->connector->ep_handle, > + qp->connector); > + if (ret) { > + ep_del_work_pending(qp->connector); > + hvnd_error("disconnect: retval is %d\n", ret); > + ep_stop(qp->connector); > + goto free_connector; > + } > + /* > + * Now wait for the disconnect. 
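The teardown here is a two-phase handshake with the host; in outline:

    /*
     *   hvnd_destroy_qp()                    event path
     *   -----------------                    ----------
     *   init_completion(&disconnect_event)
     *   hvnd_connector_disconnect()   --->   host completes the IRP
     *   wait_for_completion(...)      <---   hvnd_process_disconnect()
     *                                          complete(&disconnect_event)
     */

Note that the wait is uninterruptible, so a host that never answers the disconnect would wedge the destroying task; presumably acceptable for a cooperative hypervisor, but worth keeping in mind.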
> + */ > + jiffies = get_jiffies_64(); > + wait_for_completion(&qp->connector->disconnect_event); > + hvnd_debug("Completed disconnect connector=%p jiffies=%llu\n", > + qp->connector, get_jiffies_64() - jiffies); > + > + /* > + * Now free up the connector and drop the reference on uctx. > + */ > + > + ep_stop(qp->connector); > + > +free_connector: > + hvnd_debug("destroying connector handle: %p\n", > + (void *) qp->connector->ep_handle); > + hvnd_free_handle(nd_dev, qp->uctx, > + qp->connector->ep_handle, > + IOCTL_ND_CONNECTOR_FREE); > + > + hvnd_drop_uctx_ref(nd_dev, qp->uctx); > + hvnd_destroy_ep(qp->connector); > + qp->connector = NULL; > +free_qp: > + atomic_dec(&qp->refcnt); > + hvnd_debug("Waiting for the ref cnt to go to 0\n"); > + > + wait_event(qp->wait, !atomic_read(&qp->refcnt)); > + > + hvnd_debug("About to destroy qp\n"); > + hvnd_db_unmap_user(qp->uctx, (u64)qp->db_addr); > + ib_umem_release(qp->umem); > + > + hvnd_debug("About to free qp\n"); > + ret = hvnd_free_qp(nd_dev, qp->uctx, qp); > + > + if (ret == 0) { > + hvnd_drop_uctx_ref(nd_dev, qp->uctx); > + kfree(qp); > + } else { > + hvnd_error("free qp failed: ret is %d\n", ret); > + } > + > + return ret; > +} > + > +static struct ib_cq *hvnd_ib_create_cq(struct ib_device *ibdev, > + const struct ib_cq_init_attr *attr, > + struct ib_ucontext *ib_context, > + struct ib_udata *udata) > +{ > + struct hvnd_ucontext *uctx; > + struct hvnd_dev *nd_dev; > + struct mlx4_ib_create_cq ucmd; > + struct hvnd_cq *cq; > + int ret = 0; > + int entries = attr->cqe; > + > + uctx = to_nd_context(ib_context); > + nd_dev = to_nd_dev(ibdev); > + > + if (entries < 1 || entries > uctx->max_cqe) { > + hvnd_error("incorrect entries=%d\n", entries); > + ret = -EINVAL; > + goto err_done; > + } > + > + cq = kzalloc(sizeof(*cq), GFP_KERNEL); > + if (!cq) { > + ret = -ENOMEM; > + goto err_done; > + } > + > + entries = roundup_pow_of_two(entries + 1); > + cq->ibcq.cqe = entries - 1; > + cq->entries = entries; > + cq->uctx = uctx; > + > + if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { > + hvnd_error("ib_copy_from_udata failed\n"); > + ret = -EFAULT; > + goto err_ucpy; > + } > + > + cq->cq_buf = (void *)ucmd.buf_addr; > + cq->db_addr = (void *)ucmd.db_addr; > + cq->arm_sn = 0; > + > + /* > + * Initialize the IRP state. Need to have a separate irp state > + * for CQ; for now share it with Listener/connector.
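The sizing just above mirrors the usual mlx4 convention: reserve one slot, round the ring up to a power of two, and report one less than the ring size back as the usable depth. Worked through for a requested depth of 256 (numbers are only an example):

    /*
     *   entries  = roundup_pow_of_two(256 + 1) = 512
     *   ibcq.cqe = 512 - 1                     = 511
     *
     * and the buffer pinned below is entries * uctx->cqe_size bytes.
     */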
> + */ > + ret = hvnd_init_ep(&cq->ep_object, NULL, ND_CQ, nd_dev, uctx); > + > + if (ret) { > + hvnd_error("hvnd_init_ep failed ret=%d\n", ret); > + goto err_ucpy; > + } > + > + cq->ep_object.cq = cq; > + cq->monitor = true; > + > + cq->umem = ib_umem_get(ib_context, ucmd.buf_addr, > + (entries * uctx->cqe_size), > + IB_ACCESS_LOCAL_WRITE, 1); > + if (IS_ERR(cq->umem)) { > + ret = PTR_ERR(cq->umem); > + hvnd_error("ib_umem_get failed ret=%d\n", ret); > + goto err_ucpy; > + } > + > + ret = hvnd_db_map_user(uctx, ucmd.db_addr, &cq->db_umem); > + > + if (ret) { > + hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); > + goto err_db_map; > + } > + > + ret = hvnd_create_cq(nd_dev, uctx, cq); > + > + if (ret) { > + hvnd_error("hvnd_create_cq failed ret=%d\n", ret); > + goto err_cq; > + } > + > + cq->ep_object.ep_handle = cq->cq_handle; > + > + if (ib_copy_to_udata(udata, &cq->cqn, sizeof(__u32))) { > + hvnd_error("ib_copy_to_udata failed\n"); > + ret = -EFAULT; > + goto err_ucpy_out; > + } > + > + if (!disable_cq_notify) { > + > + if (!ep_add_work_pending(&cq->ep_object)) > + goto err_ucpy_out; > + > + ret = hvnd_notify_cq(nd_dev, cq, ND_CQ_NOTIFY_ANY, > + (u64)&cq->ep_object); > + > + if (ret) { > + ep_del_work_pending(&cq->ep_object); > + hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); > + goto err_ucpy_out; > + } > + } > + > + hvnd_acquire_uctx_ref(uctx); > + > + return &cq->ibcq; > + > +err_ucpy_out: > + hvnd_destroy_cq(nd_dev, cq); > + > +err_cq: > + hvnd_db_unmap_user(uctx, ucmd.db_addr); > + > +err_db_map: > + ib_umem_release(cq->umem); > + > +err_ucpy: > + kfree(cq); > +err_done: > + return ERR_PTR(ret); > +} > + > +static struct ib_qp *hvnd_get_qp(struct ib_device *dev, int qpn) > +{ > + struct hvnd_dev *nd_dev; > + struct hvnd_qp *qp = NULL; > + > + nd_dev = to_nd_dev(dev); > + qp = get_qpp(nd_dev, qpn); > + return qp ? &qp->ibqp : NULL; > +} > + > +static int hvnd_ib_destroy_cq(struct ib_cq *ib_cq) > +{ > + struct hvnd_ucontext *uctx; > + struct hvnd_dev *nd_dev; > + struct hvnd_cq *cq; > + > + cq = to_nd_cq(ib_cq); > + uctx = cq->uctx; > + nd_dev = to_nd_dev(uctx->ibucontext.device); > + > + cq->monitor = false; > + > +// hvnd_cancel_io(&cq->ep_object); > + ep_stop(&cq->ep_object); > + > + hvnd_deinit_ep(&cq->ep_object); > + > + hvnd_db_unmap_user(uctx, (u64)cq->db_addr); > + ib_umem_release(cq->umem); > + > + hvnd_destroy_cq(nd_dev, cq); > + > + hvnd_drop_uctx_ref(nd_dev, uctx); > + kfree(cq); > + > + return 0; > +} > + > +static int hvnd_resize_cq(struct ib_cq *cq, int cqe, struct ib_udata *udata) > +{ > + /* > + * NDDirect does not support resizing CQ.
> + */ > + hvnd_info("check code\n"); > + return -ENOSYS; > +} > + > +static int hvnd_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc) > +{ > + hvnd_info("check code\n"); > + return 0; > +} > + > +static struct ib_mr *hvnd_get_dma_mr(struct ib_pd *pd, int acc) > +{ > + hvnd_info("check code\n"); > + return NULL; > +} > + > +static void debug_dump_umem(struct ib_umem *umem) > +{ > +#ifdef HVND_MEM_DEBUG > + struct ib_umem_chunk *chunk; > + struct scatterlist *sg; > + int len, j, entry; > + int shift = ffs(umem->page_size) - 1; > + > + hvnd_debug("umem=%p\n", umem); > + hvnd_debug("context=%p length=%lu offset=%d page_size=%d writable=%d hugetlb=%d\n", > + umem->context, > + umem->length, > + umem->offset, > + umem->page_size, > + umem->writable, > + umem->hugetlb); > + > + list_for_each_entry(chunk, &umem->chunk_list, list) { > + hvnd_debug("chunk->nmap=%d\n", chunk->nmap); > + for (j = 0; j < chunk->nmap; ++j) { > + sg = &chunk->page_list[j]; > + hvnd_debug("sg_dma_len=%d sg_dma_address=%llx\n", > + sg_dma_len(sg), sg_dma_address(sg)); > + hvnd_debug("page_link=%lx offset=%u length=%u\n", > + sg->page_link, sg->offset, sg->length); > + len = sg_dma_len(&chunk->page_list[j]) >> shift; > + for_each_sg(&chunk->page_list[j], sg, len, entry) { > + hvnd_debug("PFN=%lu\n", > + page_to_pfn(sg_page(sg))); > + } > + } > + } > +#endif > +} > + > + > +static struct ib_mr *hvnd_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, > + u64 virt, int acc, struct ib_udata *udata) > +{ > + int err = 0; > + struct hvnd_ib_pd *hvndpd = to_nd_pd(pd); > + struct hvnd_mr *mr; > + > + mr = kmalloc(sizeof(*mr), GFP_KERNEL); > + if (!mr) > + return ERR_PTR(-ENOMEM); > + mr->pd = hvndpd; > + > + mr->umem = ib_umem_get(pd->uobject->context, start, length, acc, 0); > + if (IS_ERR(mr->umem)) { > + err = PTR_ERR(mr->umem); > + hvnd_error("ib_umem_get failed ret=%d\n", err); > + kfree(mr); > + return ERR_PTR(err); > + } > + > + debug_dump_umem(mr->umem); > + > + mr->start = start; > + mr->length = length; > + mr->virt = virt; > + mr->acc = acc; > + > + hvnd_debug("start=%llx length=%llx virt=%llx acc=%d\n", > + start, length, virt, acc); > + > + /* > + * First create a memory region. > + */ > + err = hvnd_cr_mr(to_nd_dev(pd->device), > + to_nd_context(pd->uobject->context), hvndpd->handle, > + &mr->mr_handle); > + if (err) { > + hvnd_error("cr_mr failed; ret is %d\n", err); > + goto err; > + } > + > + err = hvnd_mr_register(to_nd_dev(pd->device), > + to_nd_context(pd->uobject->context), mr); > + > + if (err) > + goto err0; > + > + hvnd_acquire_uctx_ref(to_nd_context(pd->uobject->context)); > + > + return &mr->ibmr; > + > +err0: > + hvnd_free_mr(to_nd_dev(pd->device), > + to_nd_context(pd->uobject->context), mr->mr_handle); > +err: > + ib_umem_release(mr->umem); > + kfree(mr); > + return ERR_PTR(err); > +} > + > + > + > +static int hvnd_dereg_mr(struct ib_mr *ib_mr) > +{ > + int ret; > + struct hvnd_mr *mr = to_nd_mr(ib_mr); > + struct hvnd_ucontext *uctx = to_nd_context(ib_mr->pd->uobject->context); > + struct hvnd_dev *nd_dev = to_nd_dev(ib_mr->device); > + > + > + hvnd_debug("dereg_mr entering\n"); > + > + ret = hvnd_deregister_mr(nd_dev, uctx, mr->mr_handle); > + > + if (ret) { > + hvnd_error("hvnd_deregister_mr() failed: %x\n", ret); > + return ret; > + } > + /* > + * Now free up the memory region. 
> + */ > + > + ret = hvnd_free_mr(nd_dev, uctx, mr->mr_handle); > + if (ret) { > + hvnd_error("hvnd_free_mr() failed: %x\n", ret); > + return ret; > + } > + > + ib_umem_release(mr->umem); > + > + hvnd_drop_uctx_ref(nd_dev, uctx); > + kfree(mr); > + > + hvnd_debug("dereg_mr done\n"); > + return 0; > +} > + > +static struct ib_mw *hvnd_alloc_mw(struct ib_pd *pd, enum ib_mw_type type, > + struct ib_udata *udata) > +{ > + hvnd_info("check code\n"); > + return NULL; > +} > + > +static int hvnd_dealloc_mw(struct ib_mw *mw) > +{ > + debug_check(__func__, __LINE__); > + return 0; > +} > + > +static int hvnd_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags) > +{ > + struct hvnd_ucontext *uctx; > + struct hvnd_dev *nd_dev; > + struct hvnd_cq *cq; > + > + cq = to_nd_cq(ibcq); > + uctx = cq->uctx; > + nd_dev = to_nd_dev(uctx->ibucontext.device); > + > + > + debug_check(__func__, __LINE__); > + > + return 0; > +} > + > +static int hvnd_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, > + struct ib_send_wr **bad_wr) > +{ > + debug_check(__func__, __LINE__); > + return 0; > +} > + > +int hvnd_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr, > + struct ib_recv_wr **bad_wr) > +{ > + debug_check(__func__, __LINE__); > + return 0; > +} > + > +static int hvnd_resolve_addr(struct sockaddr_in *laddr, > + struct sockaddr_in *raddr, > + struct if_physical_addr *phys_addrstruct) > +{ > + int ret; > + > + phys_addrstruct->length = ETH_ALEN; > + ret = hvnd_get_neigh_mac_addr((struct sockaddr *)laddr, > + (struct sockaddr *)raddr, > + phys_addrstruct->addr); > + > + hvnd_debug("Dest MAC is %pM\n", phys_addrstruct->addr); > + return ret; > +} > + > +static int hvnd_connect(struct iw_cm_id *cm_id, > + struct iw_cm_conn_param *conn_param) > +{ > + int ret = 0; > + struct hvnd_dev *nd_dev; > + struct hvnd_ep_obj *ep_object; > + struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_id->remote_addr; > + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; > + struct hvnd_qp *qp; > + struct if_physical_addr phys_addrstruct; > + union nd_sockaddr_inet dest_addr; > + u64 connector_handle; > + union nd_sockaddr_inet addr; > + char addr_buf[50]; > + > + if (cm_id->remote_addr.ss_family != AF_INET) { > + hvnd_error("cm_id->remote_addr.ss_family=%d not AF_INET\n", > + cm_id->remote_addr.ss_family); > + return -ENOSYS; > + } > + > + qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); > + > + if (!qp) { > + hvnd_error("failed to find qp conn_param->qpn=%d\n", > + conn_param->qpn); > + return -EINVAL; > + } > + > + cm_id->provider_data = qp; > + cm_id->add_ref(cm_id); > + qp->cm_id = cm_id; > + > + /* > + * Set the read/write limits. > + * Can we change the limits on a created QP? Luke? > + */ > + nd_dev = to_nd_dev(cm_id->device); > + ep_object = hvnd_setup_ep(cm_id, ND_CONNECTOR, nd_dev, qp->uctx); > + if (!ep_object) { > + hvnd_error("hvnd_setup_ep failure\n"); > + ret = -ENOMEM; > + goto err_limit; > + } > + hvnd_debug("active connection: local irp is %d\n", > + ep_object->local_irp); > + > + ret = hvnd_cr_connector(nd_dev, qp->uctx, > + &connector_handle); > + > + if (ret) { > + hvnd_error("hvnd_cr_connector failure ret=%d\n", ret); > + goto err_cr_connector; > + } > + > + hvnd_acquire_uctx_ref(qp->uctx); > + ep_object->ep_handle = connector_handle; > + ep_object->incoming = false; > + qp->connector = ep_object; > + > + /* > + * Bind the local address to the connector.
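For reference, the active-side sequence this function walks through, as implemented here:

    /*
     *   1. get_qpp()                - map conn_param->qpn to the hvnd_qp
     *   2. hvnd_cr_connector()      - create the host connector object
     *   3. hvnd_bind_connector()    - bind the local IPv4 address
     *   4. hvnd_resolve_addr()      - neighbour lookup for the peer MAC
     *   5. hvnd_connector_connect() - start the asynchronous connect with
     *                                 ird/ord and the private data blob
     */

with the matching unwind labels (err_bind_connector, err_cr_connector, err_limit) releasing exactly the state each numbered step created.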
> + */ > + hvnd_debug("Connect local address is %s\n", > + debug_inet_ntoa(laddr->sin_addr, addr_buf)); > + > + memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); > + hvnd_debug("CONNECT AF %d port %d addr %s\n", > + addr.ipv4.sin_family, addr.ipv4.sin_port, > + debug_inet_ntoa(addr.ipv4.sin_addr, addr_buf)); > + > + ret = hvnd_bind_connector(nd_dev, qp->uctx, > + connector_handle, > + &addr); > + > + if (ret) { > + hvnd_error("hvnd_bind_connector failed ret=%d\n", ret); > + goto err_bind_connector; > + } > + > + ret = hvnd_resolve_addr(laddr, raddr, &phys_addrstruct); > + if (ret) { > + hvnd_error("hvnd_resolve_addr failed ret=%d\n", ret); > + goto err_bind_connector; > + } > + > + memcpy(&dest_addr.ipv4, raddr, sizeof(struct sockaddr_in)); > + > + > + > + /* > + * Now attempt to connect. > + */ > + > + hvnd_debug("About to initiate connection\n"); > + > + if (!ep_add_work_pending(ep_object)) > + goto err_bind_connector; > + > + ep_object->cm_state = hvnd_cm_connect_received; > + ret = hvnd_connector_connect(nd_dev, qp->uctx, > + ep_object->ep_handle, > + conn_param->ird, conn_param->ord, > + conn_param->private_data_len, > + (u8 *)conn_param->private_data, > + qp->qp_handle, > + &phys_addrstruct, &dest_addr, > + ep_object); > + > + if (ret == 0) > + return 0; > + > + ep_object->cm_state = hvnd_cm_idle; > + ep_del_work_pending(ep_object); > + hvnd_error("hvnd_connector_connect failed ret=%d\n", ret); > + > +err_bind_connector: > + qp->connector = NULL; > + hvnd_free_connector(nd_dev, qp->uctx, > + connector_handle); > + hvnd_drop_uctx_ref(nd_dev, qp->uctx); > + > +err_cr_connector: > + kfree(ep_object); > + > +err_limit: > + cm_id->provider_data = NULL; > + qp->cm_id = NULL; > + cm_id->rem_ref(cm_id); > + return ret; > +} > + > +static int hvnd_accept_cr(struct iw_cm_id *cm_id, > + struct iw_cm_conn_param *conn_param) > +{ > + int ret = 0; > + struct hvnd_dev *nd_dev; > + struct hvnd_qp *qp; > + struct hvnd_ep_obj *connector; > + enum ibv_qp_state new_qp_state; > + > + hvnd_debug("Accepting connection - PASSIVE\n"); > + nd_dev = to_nd_dev(cm_id->device); > + qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); > + > + if (!qp) { > + hvnd_error("get_qpp failed conn_param->qpn=%d\n", > + conn_param->qpn); > + return -EINVAL; > + } > + > + > + connector = (struct hvnd_ep_obj *)cm_id->provider_data; > + qp->connector = connector; > + > + if (connector == NULL) { > + hvnd_error("NULL connector!\n"); > + return -EINVAL; > + } > + hvnd_debug("connector's cm_id is %p caller cm_id=%p\n", > + connector->cm_id, cm_id); > + > + connector->cq = qp->recv_cq; > + > + > + /* > + * Setup state for the accepted connection. 
> + */ > + cm_id->add_ref(cm_id); > + connector->cm_id = cm_id; > + if (conn_param == NULL) { > + hvnd_error("NULL conn_param!\n"); > + return -EINVAL; > + } > + > + connector->ord = conn_param->ord; > + connector->ird = conn_param->ird; > + > + if (!ep_add_work_pending(connector)) > + goto error; > + > + init_completion(&connector->connector_accept_event); > + > + ret = hvnd_connector_accept(nd_dev, qp->uctx, connector->ep_handle, > + qp->qp_handle, conn_param->ird, > + conn_param->ord, > + conn_param->private_data_len, > + conn_param->private_data, > + &new_qp_state, connector); > + > + if (ret) { > + ep_del_work_pending(connector); > + hvnd_error("connector accept failed\n"); > + goto error; > + } > + > + wait_for_completion(&connector->connector_accept_event); > + ret = connector->connector_accept_status; > + > + if (ret) { > + hvnd_error("connector_accept failed status=%x\n", ret); > + ret = -EIO; > + goto error; > + } > + > + hvnd_debug("Passive Connection Accepted; new qp state is %d\n", > + new_qp_state); > + connector->cm_state = hvnd_cm_accept_sent; > + return 0; > + > +error: > + ep_stop(connector); > + connector->cm_id = NULL; > + connector->cm_state = hvnd_cm_idle; > + > + qp->connector = NULL; > + cm_id->rem_ref(cm_id); > + > + return ret; > +} > + > +static int hvnd_reject_cr(struct iw_cm_id *cm_id, const void *pdata, > + u8 pdata_len) > +{ > + debug_check(__func__, __LINE__); > + return 0; > +} > + > +void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status) > +{ > + struct iw_cm_event cm_event; > + > + switch (status) { > + case STATUS_SUCCESS: > + case STATUS_CANCELLED: > + break; > + > + default: > + hvnd_warn("disconnect complete failed: status:%d\n", status); > + } > + > + > + hvnd_debug("active disconnect processed\n"); > + memset(&cm_event, 0, sizeof(cm_event)); > + > + complete(&ep_object->disconnect_event); > +} > + > + > +void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status) > +{ > + struct iw_cm_event cm_event; > + > + // make sure we only disconnect once > + if (atomic_xchg(&ep_object->disconnect_notified, 1)) > + return; > + > + /* > + * Turn off CQ monitoring. > + */ > + if (ep_object->cq) > + ep_object->cq->monitor = false; > + > + switch (ep_object->cm_state) { > + case hvnd_cm_connect_reply_sent: > + case hvnd_cm_established_sent: > + case hvnd_cm_accept_sent: > + break; > + > + default: > + hvnd_error("unexpected cm_state=%d\n", ep_object->cm_state); > + return; > + } > + > + switch (status) { > + case STATUS_SUCCESS: > + case STATUS_CANCELLED: > + case STATUS_DISCONNECTED: > + break; > + > + default: > + hvnd_warn("notify disconnect complete failed: status:%d\n", > + status); > + } > + > + hvnd_debug("passive disconnect notified\n"); > + memset(&cm_event, 0, sizeof(cm_event)); > + > + /* > + * Other end disconnected. > + * Connection has been disconnected; > + * notify the cm layer. 
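The atomic_xchg() at the top of hvnd_process_notify_disconnect() is the usual fire-once idiom, worth calling out since several paths (the host notification, QP destroy, and the flush in modify_qp) can race to report the same disconnect. Minimal form of the idiom, with hypothetical names:

    static atomic_t fired = ATOMIC_INIT(0);

    static void fire_once(void)
    {
            if (atomic_xchg(&fired, 1))
                    return;         /* some other path got here first */
            /* ... deliver the one-shot event exactly once ... */
    }

The first caller reads back 0 and proceeds; every later or concurrent caller reads back 1 and bails, so IW_CM_EVENT_CLOSE is delivered at most once.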
> + */ > + cm_event.status = -ECONNRESET; > + cm_event.event = IW_CM_EVENT_CLOSE; > + > + if ((ep_object->cm_id) && > + (ep_object->cm_id->event_handler)) { > + > + ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); > + > + ep_object->cm_id->rem_ref(ep_object->cm_id); > + ep_object->cm_state = hvnd_cm_close_sent; > + } > +} > + > +void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status) > +{ > + struct iw_cm_event cm_event; > + int ret; > + > + /* > + * This is the problem area; the return status may be: > + * 1: 0xc00000b5 (3221225653) - {Device Timeout} The specified I/O > + * operation on %hs was not completed before the time-out period > + * expired > + * 2: NTSTATUS 0xc0000241 (3221226049) - The transport connection was > + * aborted by the local system. > + * If we do nothing here, iwcm will wait for IW_CM_EVENT_ESTABLISHED > + * forever and a clean shutdown will not be possible; we need to fail > + * the call earlier, at accept time. > + */ > + > + ep_object->connector_accept_status = status; > + > + if (status) { > + hvnd_error("Connector accept failed; status is %x\n", status); > + complete(&ep_object->connector_accept_event); > + return; > + } > + > + memset(&cm_event, 0, sizeof(cm_event)); > + cm_event.event = IW_CM_EVENT_ESTABLISHED; > + cm_event.ird = ep_object->ird; > + cm_event.ord = ep_object->ord; > + cm_event.provider_data = (void *)ep_object; > + > + /* > + * We have successfully passively accepted the > + * incoming connection. > + */ > + > + hvnd_debug("Passive connection accepted!!\n"); > + if ((ep_object->cm_id) && > + (ep_object->cm_id->event_handler)) { > + ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); > + ep_object->cm_state = hvnd_cm_established_sent; > + } > + > + complete(&ep_object->connector_accept_event); > + > + /* > + * Request notification if the other end > + * were to disconnect.
> + */ > + if (!ep_add_work_pending(ep_object)) > + return; > + > + ret = hvnd_connector_notify_disconnect(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + ep_object); > + > + if (ret) { > + ep_del_work_pending(ep_object); > + hvnd_error("Connector notify disconnect failed; ret: %d\n", > + ret); > + } > +} > + > + > +void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep_object, > + int status) > +{ > + > + struct ib_cq *ibcq; > + struct hvnd_cq *cq; > + > + cq = ep_object->cq; > + ibcq = &ep_object->cq->ibcq; > + > + if (!cq->monitor) > + return; > + > + // call the previous CQ complete > + if ((status == STATUS_PENDING) && cq->upcall_pending && > + ibcq->comp_handler) { > + ibcq->comp_handler(ibcq, ibcq->cq_context); > + cq->upcall_pending = false; > + hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); > + } > + > + if ((status != STATUS_PENDING) && ibcq->comp_handler && > + ibcq->cq_context) { > + ibcq->comp_handler(ibcq, ibcq->cq_context); > + hvnd_error("CQ comp_handler called status=%x\n", status); > + } > +} > + > +void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep_object, > + int status) > +{ > + struct ib_cq *ibcq; > + struct hvnd_cq *cq; > + int ret; > + > + cq = ep_object->cq; > + ibcq = &ep_object->cq->ibcq; > + > + // call the previous CQ complete > + if (cq->upcall_pending && ibcq->comp_handler) { > + ibcq->comp_handler(ibcq, ibcq->cq_context); > + cq->upcall_pending = false; > + hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); > + } > + > + cq->upcall_pending = true; > + if (!ep_add_work_pending(ep_object)) > + return; > + > + ret = hvnd_notify_cq(ep_object->nd_dev, > + ep_object->cq, > + ND_CQ_NOTIFY_ANY, > + (u64)ep_object); > + > + if (ret) { > + ep_del_work_pending(ep_object); > +// hvnd_manage_io_state(ep_object, true); > + hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); > + } > + > + if ((status != 0) && (status != STATUS_CANCELLED)) { > + if (ibcq->event_handler) { > + struct ib_event event; > + > + event.device = ibcq->device; > + event.event = IB_EVENT_CQ_ERR; > + event.element.cq = ibcq; > + ibcq->event_handler(&event, ibcq->cq_context); > + > + hvnd_warn("CQ event_handler called status=%x\n", > + status); > + } > + } > +} > + > +int init_cm_event(struct hvnd_ep_obj *ep_object, struct iw_cm_event *cm_event, > + int event) > +{ > + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event->local_addr; > + struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event->remote_addr; > + struct nd_read_limits rd_limits; > + union nd_sockaddr_inet local_addr; > + union nd_sockaddr_inet remote_addr; > + int ret; > + > + /* > + * Now get the local address. > + */ > + ret = hvnd_connector_get_local_addr(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &local_addr); > + > + if (ret) { > + hvnd_error("Connector get addr failed; ret: %d\n", ret); > + return ret; > + } > + /* > + * Now get the remote address. > + */ > + ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &remote_addr); > + > + if (ret) { > + hvnd_error("Connector get peer addr failed; ret: %d\n", ret); > + return ret; > + } > + > + /* > + * Get other connection parameters. > + */ > + > + ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &rd_limits); > + > + if (ret) { > + hvnd_error("Connector rd limits failed; ret: %d\n", ret); > + return ret; > + } > + > + /* > + * XXXKYS: Luke: What about the length of the priv data?
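A note on the two CQ handlers above: notifications from the host appear to be one-shot, so hvnd_process_cq_event_complete() re-arms every time; in outline:

    /*
     *   completion arrives
     *     -> deliver any pending upcall via ibcq->comp_handler()
     *     -> mark upcall_pending
     *     -> hvnd_notify_cq(..., ND_CQ_NOTIFY_ANY, ...)   (re-arm)
     *
     * If the re-arm fails (the error branch above), no further
     * completion callbacks arrive for this CQ.
     */

The same hvnd_notify_cq() call seeds the very first arm at CQ creation time, unless disable_cq_notify is set.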
> + */ > + ret = hvnd_connector_get_priv_data(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + ep_object->priv_data); > + > + if (ret) { > + hvnd_error("Connector get priv data failed; ret: %d\n", ret); > + return ret; > + } > + /* > + * Initialize CM structure. > + */ > + laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; > + hvnd_debug("Local addr is %d\n", laddr->sin_addr.s_addr); > + laddr->sin_port = local_addr.ipv4.sin_port; > + laddr->sin_family = AF_INET; > + > + raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; > + hvnd_debug("Remote addr is %d\n", raddr->sin_addr.s_addr); > + raddr->sin_port = remote_addr.ipv4.sin_port; > + raddr->sin_family = AF_INET; > + > + // KYS; Luke: is it always 148 bytes? > + cm_event->private_data_len = MAX_PRIVATE_DATA_LEN; > + cm_event->private_data = ep_object->priv_data; > + > + cm_event->ird = rd_limits.inbound; > + cm_event->ord = rd_limits.outbound; > + cm_event->event = event; > + > + ep_object->ird = cm_event->ird; > + ep_object->ord = cm_event->ord; > + > + return 0; > +} > + > +static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status) > +{ > + struct iw_cm_event cm_event; > + > + memset(&cm_event, 0, sizeof(cm_event)); > + > + cm_event.event = IW_CM_EVENT_CONNECT_REPLY; > + cm_event.status = status; > + > + hvnd_debug("returning status %d on connector %p\n", status, connector); > + > + if (connector->cm_id && connector->cm_id->event_handler) { > + connector->cm_id->event_handler(connector->cm_id, &cm_event); > + connector->cm_id->rem_ref(connector->cm_id); > + connector->cm_state = hvnd_cm_connect_reply_refused; > + } > +} > + > +void hvnd_process_events(struct work_struct *work) > +{ > + struct hvnd_work *wrk; > + struct nd_read_limits rd_limits; > + struct hvnd_ep_obj *ep_object; > + struct hvnd_ep_obj *parent; > + struct iw_cm_event cm_event; > + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event.local_addr; > + struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event.remote_addr; > + struct ndv_packet_hdr_control_1 *ctrl_hdr; > + union nd_sockaddr_inet local_addr; > + union nd_sockaddr_inet remote_addr; > + struct pkt_nd_get_connection_listener *connection_pkt; > + struct iw_cm_id *cm_id = NULL; > + int status; > + int ioctl; > + int ret; > + char priv_data[MAX_PRIVATE_DATA_LEN]; > + enum ibv_qp_state new_qp_state; > + struct incoming_pkt *incoming_pkt; > + unsigned long flags; > + > + memset(&cm_event, 0, sizeof(cm_event)); > + memset(&priv_data, 0, MAX_PRIVATE_DATA_LEN); > + > + > + wrk = container_of(work, struct hvnd_work, work); > + > + /* > + * Now call into the connection manager.
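> + * Packets queued by the channel callback are drained from > + * ep_object->incoming_pkt_list one at a time; the list is rechecked > + * after each event so a packet that arrived during processing is > + * not lost.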
> + */ > + ep_object = (struct hvnd_ep_obj *)wrk->callback_arg; > + parent = ep_object->parent; > + > +process_next: > + incoming_pkt = NULL; > + spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); > + if (!list_empty(&ep_object->incoming_pkt_list)) { > + incoming_pkt = list_first_entry(&ep_object->incoming_pkt_list, > + struct incoming_pkt, > + list_entry); > + list_del(&incoming_pkt->list_entry); > + } > + spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, flags); > + if (incoming_pkt == NULL) > + return; > + > + ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt->pkt; > + status = ctrl_hdr->io_status; > + ioctl = ctrl_hdr->io_cntrl_code; > + > + hvnd_debug("Process Events IOCTL is: %s; iostatus failure: %x in work queue\n", > + hvnd_get_op_name(ioctl), status); > + > + if (status != 0) { > + bool log_error = true; > + > + if (ioctl == IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT && > + status == STATUS_DISCONNECTED) // expected > + log_error = false; > + > + if (log_error) > + hvnd_warn("Process Events IOCTL is: %s; iostatus failure: %x\n", > + hvnd_get_op_name(ioctl), status); > + } > + > + cm_event.status = status; > + > + switch (ep_object->type) { > + case ND_CONNECTOR: > + switch (ioctl) { > + case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: > + > + if (ep_object->parent != NULL) { > + > + // Do nothing with this connection request if > + // listener is stopping > + if (!ep_add_work_pending(ep_object->parent)) > + break; > + > + cm_id = ep_object->parent->cm_id; //Listener > + } > + > + connection_pkt = (struct pkt_nd_get_connection_listener *) ctrl_hdr; > + > + if ((status == 0) || (status == STATUS_CANCELLED)) { > + hvnd_get_incoming_connections(ep_object->parent, > + ep_object->parent->nd_dev, > + ep_object->uctx); > + } > + > + if (status) > + goto get_connection_request_done; > + > + /* > + * Now get the local address. > + */ > + ret = hvnd_connector_get_local_addr(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &local_addr); > + > + if (ret) { > + hvnd_error("Connector get addr failed; ret: %d\n", > + ret); > + goto get_connection_request_done; > + } > + /* > + * Now get the remote address. > + */ > + ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &remote_addr); > + > + if (ret) { > + hvnd_error("Connector get peer addr failed; ret: %d\n", > + ret); > + goto get_connection_request_done; > + } > + /* > + * Get other connection parameters. > + */ > + > + ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &rd_limits); > + > + if (ret) { > + hvnd_error("Connector rd limits failed; ret: %d\n", > + ret); > + goto get_connection_request_done; > + } > + > + /* > + * XXXKYS: Luke: What about the length of the priv data?
> + */ > + ret = hvnd_connector_get_priv_data( > + ep_object->nd_dev, ep_object->uctx, > + ep_object->ep_handle, ep_object->priv_data); > + > + if (ret) { > + hvnd_error("Connector get priv data failed; ret: %d\n", > + ret); > + goto get_connection_request_done; > + } > + > + cm_event.event = IW_CM_EVENT_CONNECT_REQUEST; > + cm_event.provider_data = (void *)ep_object; > + > + laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; > + hvnd_debug("Local addr is %d\n", > + laddr->sin_addr.s_addr); > + laddr->sin_port = local_addr.ipv4.sin_port; > + laddr->sin_family = AF_INET; > + > + raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; > + hvnd_debug("Remote addr is %d\n", > + raddr->sin_addr.s_addr); > + raddr->sin_port = remote_addr.ipv4.sin_port; > + raddr->sin_family = AF_INET; > + > + // KYS; Luke: is it always 148 bytes? > + cm_event.private_data_len = MAX_PRIVATE_DATA_LEN; > + cm_event.private_data = ep_object->priv_data; > + > + cm_event.ird = rd_limits.inbound; > + cm_event.ord = rd_limits.outbound; > + > + ep_object->ird = cm_event.ird; > + ep_object->ord = cm_event.ord; > + > + > + if ((cm_id != NULL) && cm_id->event_handler) { > + cm_id->event_handler(cm_id, &cm_event); > + ep_object->cm_state = hvnd_cm_connect_request_sent; > + } > + > +get_connection_request_done: > + if (ep_object->parent != NULL) > + ep_del_work_pending(ep_object->parent); > + break; > + > + case IOCTL_ND_CONNECTOR_CONNECT: > + > + cm_event.event = IW_CM_EVENT_CONNECT_REPLY; > + if (status == STATUS_TIMEOUT && > + ep_object->connector_connect_retry < 3) { > + //TIMEOUT retry > + > + if (!ep_add_work_pending(ep_object)) > + goto refuse_connection; > + > + hvnd_warn("Connector connect timed out, reconnecting... retry count: %d\n", > + ep_object->connector_connect_retry); > + ep_object->connector_connect_retry++; > + ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, > + &ep_object->connector_connect_pkt.hdr, > + sizeof(ep_object->connector_connect_pkt), > + (u64)&ep_object->connector_connect_pkt); > + > + if (ret) { > + hvnd_error("Connector retry on timeout failed: %d\n", > + ret); > + ep_del_work_pending(ep_object); > + goto refuse_connection; > + } > + break; > + } > + > +refuse_connection: > + if (status) { > + cm_event.status = -ECONNREFUSED; > + if (status == STATUS_TIMEOUT) > + cm_event.status = -ETIMEDOUT; > + > + hvnd_refuse_connection(ep_object, > + cm_event.status); > + break; > + } > + > + hvnd_debug("ACTIVE Connection ACCEPTED\n"); > + ret = init_cm_event(ep_object, &cm_event, > + IW_CM_EVENT_CONNECT_REPLY); > + if (ret) { > + hvnd_error("init_cm_event failed ret=%d\n", > + ret); > + goto process_done; > + } > + > + ret = hvnd_connector_complete_connect(ep_object->nd_dev, > + ep_object->uctx, > + ep_object->ep_handle, > + &new_qp_state); > + if (ret) { > + hvnd_error("connector_complete failed\n"); > + goto process_done; > + } > + > + cm_event.provider_data = (void *)ep_object; > + > + if ((ep_object->cm_id) && > + (ep_object->cm_id->event_handler)) { > + ep_object->cm_id->event_handler( > + ep_object->cm_id, &cm_event); > + ep_object->cm_state = hvnd_cm_connect_reply_sent; > + } > + /* > + * Request notification if the other end > + * were to disconnect.
> + */ > + if (!ep_add_work_pending(ep_object)) > + goto process_done; > + > + ret = hvnd_connector_notify_disconnect( > + ep_object->nd_dev, ep_object->uctx, > + ep_object->ep_handle, ep_object); > + > + if (ret) { > + ep_del_work_pending(ep_object); > + hvnd_error("Connector notify disconnect failed; ret: %d\n", > + ret); > + } > + > + break; > + > + case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: > + hvnd_process_notify_disconnect(ep_object, status); > + break; > + > + > + default: > + hvnd_error("Unknown Connector IOCTL\n"); > + break; > + } > + break; > + default: > + hvnd_error("Unknown endpoint object\n"); > + break; > + } > +process_done: > + kfree(incoming_pkt); > + ep_del_work_pending(ep_object); > + > + goto process_next; > +} > + > + > +static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, > + struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx) > +{ > + struct hvnd_ep_obj *ep_object; > + int ret; > + > + ep_object = kzalloc(sizeof(struct hvnd_ep_obj), GFP_KERNEL); > + > + if (!ep_object) > + return NULL; > + > + ret = hvnd_init_ep(ep_object, cm_id, ep_type, nd_dev, uctx); > + > + if (ret) { > + hvnd_error("hvnd_init_ep failed ret=%d\n", ret); > + kfree(ep_object); > + return NULL; > + } > + > + return ep_object; > +} > + > +/* > + * return value: > + * true: I/O state is stopped, we should not do upcall > + * false: I/O state is running and normal > + */ > +#if 0 > +static bool hvnd_manage_io_state(struct hvnd_ep_obj *ep, bool failure) > +{ > + unsigned long flags; > + > + spin_lock_irqsave(&ep->ep_lk, flags); > + if (ep->to_be_destroyed) { > + hvnd_warn("ep being destroyed\n"); > + if (ep->io_outstanding) { > + hvnd_warn("ep being destroyed, I/O pending; waking up on %p\n", > + &ep->block_event); > + complete(&ep->block_event); > + ep->io_outstanding = false; > + } > + spin_unlock_irqrestore(&ep->ep_lk, flags); > + return true; > + } > + if (!failure) > + ep->io_outstanding = true; > + spin_unlock_irqrestore(&ep->ep_lk, flags); > + return false; > +} > +#endif > + > +static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, > + struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx) > +{ > + struct hvnd_ep_obj *connector; > + u64 connector_handle; > + int ret; > + > + /* > + * First handle the protocol for > + * destruction - outstanding I/O. > + */ > + > +// if (hvnd_manage_io_state(listener, false)) > +// return 0; > + /* > + * Create a connector. > + */ > + connector = hvnd_setup_ep(listener->cm_id, ND_CONNECTOR, nd_dev, uctx); > + if (!connector) { > + hvnd_error("hvnd_setup_ep failed\n"); > + ret = -ENOMEM; > + goto con_alloc_err; > + } > + > + ret = hvnd_cr_connector(nd_dev, uctx, > + &connector_handle); > + if (ret) { > + hvnd_error("hvnd_cr_connector failed ret=%d\n", ret); > + goto con_cr_err; > + } > + > + /* > + * Now get a connection if one is pending.
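> + * A connector is posted on the listener ahead of time; the host > + * completes it when a peer connects, and the completion arrives as > + * IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST in hvnd_process_events().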
> + */ > + connector->ep_handle = connector_handle; > + connector->parent = listener; > + > + if (!ep_add_work_pending(connector)) > + goto get_connection_err; > + > + ret = hvnd_get_connection_listener(nd_dev, uctx, > + listener->ep_handle, > + connector_handle, > + (u64)connector); > + > + if (ret) { > + hvnd_debug("listener_get_connection failed\n"); > + ep_del_work_pending(connector); > + goto get_connection_err; > + } > + > + hvnd_acquire_uctx_ref(uctx); > + listener->outstanding_handle = connector_handle; > + listener->outstanding_ep = connector; > + hvnd_debug("outstanding handle is %p\n", (void *)connector_handle); > + return 0; > + > +get_connection_err: > + hvnd_free_handle(nd_dev, uctx, > + connector_handle, > + IOCTL_ND_CONNECTOR_FREE); > + > +con_cr_err: > + kfree(connector); > +con_alloc_err: > +// hvnd_manage_io_state(listener, true); > + return ret; > +} > + > +static int hvnd_create_listen(struct iw_cm_id *cm_id, int backlog) > +{ > + int ret = 0; > + struct hvnd_dev *nd_dev; > + struct hvnd_ucontext *uctx; > + struct hvnd_ep_obj *ep_object; > + union nd_sockaddr_inet addr; > + union nd_sockaddr_inet local_addr; > + u64 listener_handle; > + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; > + union nd_sockaddr_inet og_addr; > + > + > + nd_dev = to_nd_dev(cm_id->device); > + uctx = get_uctx(nd_dev, current_pid()); > + hvnd_debug("uctx is %p; pid is %d\n", uctx, current_pid()); > + > + if (cm_id->local_addr.ss_family != AF_INET) { > + hvnd_error("cm_id->local_addr.ss_family =%d not AF_INET\n", > + cm_id->local_addr.ss_family); > + return -ENOSYS; > + } > + > + /* > + * If the local address is LOOPBACK or INADDR_ANY, get an address > + * to bind the listener. For now, just get the first address > + * available. > + */ > + > + if (IN_LOOPBACK(ntohl(laddr->sin_addr.s_addr)) || > + (laddr->sin_addr.s_addr == INADDR_ANY)) { > + > + hvnd_debug("need to get an address\n"); > + ret = hvnd_get_outgoing_rdma_addr(nd_dev, uctx, &og_addr); > + > + if (ret) { > + hvnd_error("failed to get the og address\n"); > + return ret; > + } > + > + laddr->sin_addr.s_addr = og_addr.ipv4.sin_addr.s_addr; > + } > + > + cm_id->add_ref(cm_id); > + > + ep_object = hvnd_setup_ep(cm_id, ND_LISTENER, nd_dev, uctx); > + > + if (!ep_object) { > + hvnd_error("hvnd_setup_ep returned NULL\n"); > + ret = -ENOMEM; > + goto alloc_err; > + } > + > + ret = hvnd_cr_listener(nd_dev, uctx, > + &listener_handle); > + if (ret) { > + hvnd_error("hvnd_cr_listener failed ret=%d\n", ret); > + goto cr_err; > + } > + > + ep_object->ep_handle = listener_handle; > + > + cm_id->provider_data = ep_object; > + > + /* > + * Now bind the listener. > + * IPV4 support only. > + */ > + memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); > + > + ret = hvnd_bind_listener(nd_dev, uctx, > + listener_handle, > + &addr); > + if (ret) { > + hvnd_error("hvnd_bind_listener failed ret=%d\n", ret); > + goto bind_err; > + } > + > + /* > + * Now get the local address. > + */ > + ret = hvnd_get_addr_listener(nd_dev, uctx, > + listener_handle, > + &local_addr); > + if (ret) { > + hvnd_error("hvnd_get_addr_listener failed ret=%d\n", ret); > + goto bind_err; > + } > + > + /* > + * Now put the listener in the listen mode. > + */ > + > + ret = hvnd_listen_listener(nd_dev, uctx, > + listener_handle, > + backlog); > + > + if (ret) { > + hvnd_error("hvnd_listen_listener failed ret=%d\n", ret); > + goto bind_err; > + } > + > + > + /* > + * Now get a pending connection if one is available.
> + */ > + ret = hvnd_get_incoming_connections(ep_object, nd_dev, uctx); > + if (ret) { > + hvnd_error("hvnd_get_incoming_connections failed ret=%d\n", > + ret); > + goto bind_err; > + } > + > + hvnd_acquire_uctx_ref(uctx); > + hvnd_debug("cm_id=%p\n", cm_id); > + return 0; > + > +bind_err: > + hvnd_free_handle(nd_dev, uctx, > + listener_handle, > + IOCTL_ND_LISTENER_FREE); > +cr_err: > + kfree(ep_object); > +alloc_err: > + cm_id->provider_data = NULL; > + cm_id->rem_ref(cm_id); > + return ret; > +} > + > +static int hvnd_destroy_listen(struct iw_cm_id *cm_id) > +{ > + struct hvnd_dev *nd_dev; > + struct hvnd_ucontext *uctx; > + struct hvnd_ep_obj *ep_object; > + > + nd_dev = to_nd_dev(cm_id->device); > + > + ep_object = (struct hvnd_ep_obj *)cm_id->provider_data; > + > + hvnd_debug("uctx is %p\n", ep_object->uctx); > + hvnd_debug("Destroying Listener cm_id=%p\n", cm_id); > + uctx = ep_object->uctx; > + > + // make sure there is nothing in progress on this ep > + ep_stop(ep_object); > + > + hvnd_free_handle(nd_dev, uctx, > + ep_object->ep_handle, > + IOCTL_ND_LISTENER_FREE); > + > + /* > + * We may have an outstanding connector for > + * incoming connection requests; clean it up. > + */ > + > + if (ep_object->outstanding_handle != 0) { > + > + // make sure there is nothing in progress on this ep > + ep_stop(ep_object->outstanding_ep); > + > + hvnd_free_handle(nd_dev, uctx, > + ep_object->outstanding_handle, > + IOCTL_ND_CONNECTOR_FREE); > + > + > + hvnd_drop_uctx_ref(nd_dev, uctx); > + hvnd_destroy_ep(ep_object->outstanding_ep); > + } > + > + /* > + * Now everything should have stopped > + */ > + > + cm_id->rem_ref(cm_id); > + hvnd_destroy_ep(ep_object); > + cm_id->provider_data = NULL; > + hvnd_drop_uctx_ref(nd_dev, uctx); > + > + hvnd_debug("cm_id=%p\n", cm_id); > + return 0; > +} > + > +static void hvnd_qp_add_ref(struct ib_qp *ibqp) > +{ > + struct hvnd_qp *qp = to_nd_qp(ibqp); > + > + atomic_inc(&qp->refcnt); > +} > + > +void hvnd_qp_rem_ref(struct ib_qp *ibqp) > +{ > + struct hvnd_qp *qp = to_nd_qp(ibqp); > + > + if (atomic_dec_and_test(&qp->refcnt)) > + wake_up(&qp->wait); > +} > + > +static DEVICE_ATTR(hw_rev, 0444, hvnd_show_rev, NULL); > +static DEVICE_ATTR(fw_ver, 0444, hvnd_show_fw_ver, NULL); > +static DEVICE_ATTR(hca_type, 0444, hvnd_show_hca, NULL); > +static DEVICE_ATTR(board_id, 0444, hvnd_show_board, NULL); > + > +static struct device_attribute *hvnd_class_attributes[] = { > + &dev_attr_hw_rev, > + &dev_attr_fw_ver, > + &dev_attr_hca_type, > + &dev_attr_board_id, > +}; > + > +int hvnd_register_device(struct hvnd_dev *dev, char *ip_addr, char *mac_addr) > +{ > + int ret; > + > + dev->ibdev.owner = THIS_MODULE; > + dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY | IB_DEVICE_MEM_WINDOW; > + dev->ibdev.local_dma_lkey = 0; > + dev->ibdev.uverbs_cmd_mask = > + (1ull << IB_USER_VERBS_CMD_GET_CONTEXT) | > + (1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) | > + (1ull << IB_USER_VERBS_CMD_QUERY_PORT) | > + (1ull << IB_USER_VERBS_CMD_ALLOC_PD) | > + (1ull << IB_USER_VERBS_CMD_DEALLOC_PD) | > + (1ull << IB_USER_VERBS_CMD_REG_MR) | > + (1ull << IB_USER_VERBS_CMD_DEREG_MR) | > + (1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL) | > + (1ull << IB_USER_VERBS_CMD_CREATE_CQ) | > + (1ull << IB_USER_VERBS_CMD_DESTROY_CQ) | > + (1ull << IB_USER_VERBS_CMD_REQ_NOTIFY_CQ) | > + (1ull << IB_USER_VERBS_CMD_CREATE_QP) | > + (1ull << IB_USER_VERBS_CMD_MODIFY_QP) | > + (1ull << IB_USER_VERBS_CMD_QUERY_QP) | > + (1ull << IB_USER_VERBS_CMD_POLL_CQ) | > + (1ull << IB_USER_VERBS_CMD_DESTROY_QP) | > + (1ull << 
IB_USER_VERBS_CMD_POST_SEND) | > + (1ull << IB_USER_VERBS_CMD_POST_RECV); > + dev->ibdev.node_type = RDMA_NODE_RNIC; > + memcpy(dev->ibdev.node_desc, HVND_NODE_DESC, sizeof(HVND_NODE_DESC)); > + memcpy(&dev->ibdev.node_guid, mac_addr, 6); > + dev->ibdev.phys_port_cnt = 1; //dev->nports; > + dev->ibdev.num_comp_vectors = 1; > + dev->ibdev.query_device = hvnd_query_device; > + dev->ibdev.query_port = hvnd_query_port; > + dev->ibdev.get_link_layer = hvnd_get_link_layer; > + dev->ibdev.query_pkey = hvnd_query_pkey; > + dev->ibdev.query_gid = hvnd_query_gid; > + dev->ibdev.alloc_ucontext = hvnd_alloc_ucontext; > + dev->ibdev.dealloc_ucontext = hvnd_dealloc_ucontext; > + dev->ibdev.mmap = hvnd_mmap; > + dev->ibdev.alloc_pd = hvnd_allocate_pd; > + dev->ibdev.dealloc_pd = hvnd_deallocate_pd; > + dev->ibdev.create_ah = hvnd_ah_create; > + dev->ibdev.destroy_ah = hvnd_ah_destroy; > + dev->ibdev.create_qp = hvnd_ib_create_qp; > + dev->ibdev.modify_qp = hvnd_ib_modify_qp; > + dev->ibdev.query_qp = hvnd_ib_query_qp; > + dev->ibdev.destroy_qp = hvnd_destroy_qp; > + dev->ibdev.create_cq = hvnd_ib_create_cq; > + dev->ibdev.destroy_cq = hvnd_ib_destroy_cq; > + dev->ibdev.resize_cq = hvnd_resize_cq; > + dev->ibdev.poll_cq = hvnd_poll_cq; > + dev->ibdev.get_dma_mr = hvnd_get_dma_mr; > + dev->ibdev.reg_user_mr = hvnd_reg_user_mr; > + dev->ibdev.dereg_mr = hvnd_dereg_mr; > + dev->ibdev.alloc_mw = hvnd_alloc_mw; > + dev->ibdev.dealloc_mw = hvnd_dealloc_mw; > + dev->ibdev.attach_mcast = hvnd_multicast_attach; > + dev->ibdev.detach_mcast = hvnd_multicast_detach; > + dev->ibdev.process_mad = hvnd_process_mad; > + dev->ibdev.req_notify_cq = hvnd_arm_cq; > + dev->ibdev.post_send = hvnd_post_send; > + dev->ibdev.post_recv = hvnd_post_receive; > + dev->ibdev.uverbs_abi_ver = MLX4_IB_UVERBS_ABI_VERSION; > + > + dev->ibdev.get_port_immutable = hvnd_get_port_immutable; > + > + //DMA ops for mapping all possible addresses > + dev->ibdev.dev.parent = &(dev->hvdev->device); > + dev->ibdev.dev.dma_ops = &vmbus_dma_ops; > + dev->ibdev.dev.dma_mask = (u64 *) DMA_BIT_MASK(64); > + dev->ibdev.dev.coherent_dma_mask = DMA_BIT_MASK(64); > + > + dev->ibdev.iwcm = kmalloc(sizeof(struct iw_cm_verbs), GFP_KERNEL); > + if (!dev->ibdev.iwcm) > + return -ENOMEM; > + > + dev->ibdev.iwcm->connect = hvnd_connect; > + dev->ibdev.iwcm->accept = hvnd_accept_cr; > + dev->ibdev.iwcm->reject = hvnd_reject_cr; > + dev->ibdev.iwcm->create_listen = hvnd_create_listen; > + dev->ibdev.iwcm->destroy_listen = hvnd_destroy_listen; > + dev->ibdev.iwcm->add_ref = hvnd_qp_add_ref; > + dev->ibdev.iwcm->rem_ref = hvnd_qp_rem_ref; > + dev->ibdev.iwcm->get_qp = hvnd_get_qp; > + > + strlcpy(dev->ibdev.name, "mlx4_%d", IB_DEVICE_NAME_MAX); > + ret = ib_register_device(&dev->ibdev, NULL); > + if (ret) { > + hvnd_error("ib_register_device failed ret=%d\n", ret); > + goto bail1; > + } > + > +#if 0 > + for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) { > + ret = device_create_file(&dev->ibdev.dev, > + hvnd_class_attributes[i]); > + if (ret) { > + hvnd_error("device_create_file failed ret=%d\n", ret); > + goto bail2; > + } > + } > +#endif > + > + dev->ib_active = true; > + return 0; > +//bail2: > +// ib_unregister_device(&dev->ibdev); > +bail1: > + kfree(dev->ibdev.iwcm); > + return ret; > +} > + > +void hvnd_unregister_device(struct hvnd_dev *dev) > +{ > + int i; > + > + for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) > + device_remove_file(&dev->ibdev.dev, > + hvnd_class_attributes[i]); > + ib_unregister_device(&dev->ibdev); > + kfree(dev->ibdev.iwcm); 
> + ib_dealloc_device((struct ib_device *)dev); > +} > + > +static int hvnd_try_bind_nic(unsigned char *mac, __be32 ip) > +{ > + int ret; > + struct hvnd_dev *nd_dev = g_nd_dev; > + > + mutex_lock(&nd_dev->bind_mutex); > + if (nd_dev->bind_complete) { > + mutex_unlock(&nd_dev->bind_mutex); > + return 1; > + } > + > + memcpy(nd_dev->mac_addr, mac, 6); > + *(__be32 *)(nd_dev->ip_addr) = ip; > + > + /* > + * Bind the NIC. > + */ > + hvnd_info("trying to bind to IP %pI4 MAC %pM\n", > + nd_dev->ip_addr, nd_dev->mac_addr); > + ret = hvnd_bind_nic(nd_dev, false, nd_dev->ip_addr, nd_dev->mac_addr); > + if (ret || nd_dev->bind_pkt.pkt_hdr.status) { > + mutex_unlock(&nd_dev->bind_mutex); > + return 1; > + } > + > + /* if we reach here, this means bind_nic is a success */ > + hvnd_error("successfully bound to IP %pI4 MAC %pM\n", > + nd_dev->ip_addr, nd_dev->mac_addr); > + complete(&nd_dev->addr_set); > + nd_dev->bind_complete = 1; > + mutex_unlock(&nd_dev->bind_mutex); > + > + ret = hvnd_register_device(nd_dev, nd_dev->ip_addr, nd_dev->mac_addr); > + > + if (!ret) > + return 0; > + > + hvnd_error("hvnd_register_device failed ret=%d\n", ret); > + > + /* roll back all allocated resources on error */ > + iounmap(nd_dev->mmio_virt); > + release_resource(&nd_dev->mmio_resource); > + > + vmbus_close(nd_dev->hvdev->channel); > + ib_dealloc_device((struct ib_device *)nd_dev); > + > + return 1; > +} > + > +static void hvnd_inetaddr_event_up(unsigned long event, struct in_ifaddr *ifa) > +{ > + hvnd_try_bind_nic(ifa->ifa_dev->dev->dev_addr, ifa->ifa_address); > +} > + > +static int hvnd_inetaddr_event(struct notifier_block *notifier, > + unsigned long event, void *ptr) > +{ > + struct in_ifaddr *ifa = ptr; > + > + switch (event) { > + case NETDEV_UP: > + hvnd_inetaddr_event_up(event, ifa); > + break; > + default: > + hvnd_debug("Received inetaddr event %lu\n", event); > + } > + > + return NOTIFY_DONE; > +} > + > +static struct notifier_block hvnd_inetaddr_notifier = { > + .notifier_call = hvnd_inetaddr_event, > +}; > + > +static int start_bind_nic(void) > +{ > + struct net_device *dev; > + struct in_device *idev; > + struct in_ifaddr *ifa; > + > + register_inetaddr_notifier(&hvnd_inetaddr_notifier); > + > + rtnl_lock(); > + for_each_netdev(&init_net, dev) { > + idev = in_dev_get(dev); > + if (!idev) > + continue; > + for (ifa = (idev)->ifa_list; > + ifa && !(ifa->ifa_flags&IFA_F_SECONDARY); > + ifa = ifa->ifa_next) { > + hvnd_try_bind_nic(dev->dev_addr, ifa->ifa_address); > + } > + } > + rtnl_unlock(); > + > + return 0; > +} > + > +static int hvnd_probe(struct hv_device *dev, > + const struct hv_vmbus_device_id *dev_id) > +{ > + struct hvnd_dev *nd_dev; > + int ret = 0; > + > + hvnd_debug("hvnd starting\n"); > + > + nd_dev = (struct hvnd_dev *)ib_alloc_device(sizeof(struct hvnd_dev)); > + if (!nd_dev) { > + ret = -ENOMEM; > + goto err_out0; > + } > + > + nd_dev->hvdev = dev; > + /* > + * We are going to masquerade as MLX4 device; > + * Set the vendor and device ID accordingly. > + */ > + dev->vendor_id = 0x15b3; //Mellanox > + dev->device_id = 0x1003; //Mellanox HCA > + INIT_LIST_HEAD(&nd_dev->listentry); > + spin_lock_init(&nd_dev->uctxt_lk); > + nd_dev->ib_active = false; > + > + /* > + * Initialize the state for the id table. 
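> + * These idr tables map locally assigned identifiers (CQ number, > + * QP number, memory handle, IRP handle, process id) back to their > + * guest-side objects when host completions refer to them.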
> + */ > + spin_lock_init(&nd_dev->id_lock); > + idr_init(&nd_dev->cqidr); > + idr_init(&nd_dev->qpidr); > + idr_init(&nd_dev->mmidr); > + idr_init(&nd_dev->irpidr); > + idr_init(&nd_dev->uctxidr); > + > + atomic_set(&nd_dev->open_cnt, 0); > + > + sema_init(&nd_dev->query_pkt_sem, 1); > + > + ret = vmbus_open(dev->channel, HVND_RING_SZ, HVND_RING_SZ, NULL, 0, > + hvnd_callback, dev); > + > + if (ret) { > + hvnd_error("vmbus_open failed ret=%d\n", ret); > + goto err_out1; > + } > + > + hv_set_drvdata(dev, nd_dev); > + > + ret = hvnd_negotiate_version(nd_dev); > + > + if (ret) { > + hvnd_error("hvnd_negotiate_version failed ret=%d\n", ret); > + goto err_out2; > + } > + > + /* > + * Register resources with the host. > + */ > + ret = hvnd_init_resources(nd_dev); > + if (ret) { > + hvnd_error("hvnd_init_resources failed ret=%d\n", ret); > + goto err_out2; > + } > + > + /* > + * Try to bind every NIC to ND channel, > + * ND host will only return success for the correct one > + */ > + nd_dev->bind_complete = 0; > + mutex_init(&nd_dev->bind_mutex); > + init_completion(&nd_dev->addr_set); > + > + g_nd_dev = nd_dev; > + start_bind_nic(); > + > + return 0; > + > +err_out2: > + vmbus_close(dev->channel); > + > +err_out1: > + ib_dealloc_device((struct ib_device *)nd_dev); > + > +err_out0: > + return ret; > +} > + > +static int hvnd_remove(struct hv_device *dev) > +{ > + struct hvnd_dev *nd_dev = hv_get_drvdata(dev); > + > + unregister_inetaddr_notifier(&hvnd_inetaddr_notifier); > + hvnd_bind_nic(nd_dev, true, nd_dev->ip_addr, nd_dev->mac_addr); > + hvnd_unregister_device(nd_dev); > + vmbus_close(dev->channel); > + iounmap(nd_dev->mmio_virt); > + release_resource(&nd_dev->mmio_resource); > + return 0; > +} > + > +static const struct hv_vmbus_device_id id_table[] = { > + /* VMBUS RDMA class guid */ > + /* 8c2eaf3d-32a7-4b09-ab99-bd1f1c86b501 */ > + { HV_ND_GUID, }, > + { }, > +}; > + > +MODULE_DEVICE_TABLE(vmbus, id_table); > + > +static struct hv_driver hvnd_drv = { > + .name = "hv_guest_rdma", > + .id_table = id_table, > + .probe = hvnd_probe, > + .remove = hvnd_remove, > +}; > + > + > +static int __init init_hvnd_drv(void) > +{ > + > + pr_info("Registered HyperV networkDirect Driver\n"); > + hvnd_addr_init(); > + return vmbus_driver_register(&hvnd_drv); > +} > + > +static void exit_hvnd_drv(void) > +{ > + pr_info("De-Registered HyperV networkDirect Driver\n"); > + hvnd_addr_deinit(); > + vmbus_driver_unregister(&hvnd_drv); > +} > + > + > +module_init(init_hvnd_drv); > +module_exit(exit_hvnd_drv); > + > +MODULE_DESCRIPTION("Hyper-V NetworkDirect Driver"); > +MODULE_LICENSE("GPL"); > +MODULE_VERSION("4.1.3"); > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/user.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/user.h > new file mode 100644 > index 000000000000..07e6769ef43b > --- /dev/null > +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/user.h > @@ -0,0 +1,107 @@ > +/* > + * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved. > + * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved. > + * > + * This software is available to you under a choice of one of two > + * licenses. 
You may choose to be licensed under the terms of the GNU > + * General Public License (GPL) Version 2, available from the file > + * COPYING in the main directory of this source tree, or the > + * OpenIB.org BSD license below: > + * > + * Redistribution and use in source and binary forms, with or > + * without modification, are permitted provided that the following > + * conditions are met: > + * > + * - Redistributions of source code must retain the above > + * copyright notice, this list of conditions and the following > + * disclaimer. > + * > + * - Redistributions in binary form must reproduce the above > + * copyright notice, this list of conditions and the following > + * disclaimer in the documentation and/or other materials > + * provided with the distribution. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND > + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS > + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN > + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN > + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE > + * SOFTWARE. > + */ > + > +#ifndef MLX4_IB_USER_H > +#define MLX4_IB_USER_H > + > +#include <linux/types.h> > + > +/* > + * Increment this value if any changes that break userspace ABI > + * compatibility are made. > + */ > + > +#define MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION 3 > +#define MLX4_IB_UVERBS_ABI_VERSION 4 > + > +/* > + * Make sure that all structs defined in this file remain laid out so > + * that they pack the same way on 32-bit and 64-bit architectures (to > + * avoid incompatibility between 32-bit userspace and 64-bit kernels). > + * In particular do not use pointer types -- pass pointers in __u64 > + * instead. > + */ > + > +struct mlx4_ib_alloc_ucontext_resp_v3 { > + __u32 qp_tab_size; > + __u16 bf_reg_size; > + __u16 bf_regs_per_page; > +}; > + > +struct mlx4_ib_alloc_ucontext_resp { > + __u32 dev_caps; > + __u32 qp_tab_size; > + __u16 bf_reg_size; > + __u16 bf_regs_per_page; > + __u32 cqe_size; > +}; > + > +struct mlx4_ib_alloc_pd_resp { > + __u32 pdn; > + __u32 reserved; > +}; > + > +struct mlx4_ib_create_cq { > + __u64 buf_addr; > + __u64 db_addr; > +}; > + > +struct mlx4_ib_create_cq_resp { > + __u32 cqn; > + __u32 reserved; > +}; > + > +struct mlx4_ib_resize_cq { > + __u64 buf_addr; > +}; > + > +struct mlx4_ib_create_srq { > + __u64 buf_addr; > + __u64 db_addr; > +}; > + > +struct mlx4_ib_create_srq_resp { > + __u32 srqn; > + __u32 reserved; > +}; > + > +struct mlx4_ib_create_qp { > + __u64 buf_addr; > + __u64 db_addr; > + __u8 log_sq_bb_count; > + __u8 log_sq_stride; > + __u8 sq_no_prefetch; > + __u8 reserved[5]; > +}; > + > +#endif /* MLX4_IB_USER_H */ > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.c > new file mode 100644 > index 000000000000..daf68dcccb1b > --- /dev/null > +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.c > @@ -0,0 +1,3074 @@ > +/* > + * Copyright (c) 2014, Microsoft Corporation. > + * > + * Author: > + * K. Y. Srinivasan <kys@microsoft.com> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms of the GNU General Public License version 2 as published > + * by the Free Software Foundation. 
> + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or > + * NON INFRINGEMENT. See the GNU General Public License for more > + * details. > + * > + * Bug fixes/enhancements: Long Li <longli@microsoft.com> > + */ > + > +#include <linux/completion.h> > +#include <linux/module.h> > +#include <linux/errno.h> > +#include <linux/hyperv.h> > +#include <linux/efi.h> > +#include <linux/slab.h> > +#include <linux/cred.h> > +#include <linux/sched.h> > +#include <linux/types.h> > +#include <linux/scatterlist.h> > +#include <rdma/ib_umem.h> > +#include <rdma/ib_user_verbs.h> > +#include <asm-generic/delay.h> > + > +#include "vmbus_rdma.h" > + > +/* > + * We only have a single rdma device on the host, so > + * a single receive buffer suffices. > + */ > + > + > +static char hvnd_recv_buffer[PAGE_SIZE * 4]; > + > +static atomic_t irp_local_hdl; > + > +char *hvnd_get_op_name(int ioctl) > +{ > + switch (ioctl) { > + case IOCTL_ND_PROVIDER_INIT: > + return "IOCTL_ND_PROVIDER_INIT\n"; > + case IOCTL_ND_PROVIDER_BIND_FILE: > + return "IOCTL_ND_PROVIDER_BIND_FILE\n"; > + case IOCTL_ND_ADAPTER_OPEN: > + return "IOCTL_ND_ADAPTER_OPEN\n"; > + > + case IOCTL_ND_ADAPTER_CLOSE: > + return "IOCTL_ND_ADAPTER_CLOSE\n"; > + > + case IOCTL_ND_ADAPTER_QUERY: > + return "IOCTL_ND_ADAPTER_QUERY\n"; > + > + case IOCTL_ND_PD_CREATE: > + return "IOCTL_ND_PD_CREATE\n"; > + > + case IOCTL_ND_PD_FREE: > + return "IOCTL_ND_PD_FREE\n"; > + > + case IOCTL_ND_CQ_CREATE: > + return "IOCTL_ND_CQ_CREATE\n"; > + > + case IOCTL_ND_CQ_FREE: > + return "IOCTL_ND_CQ_FREE\n"; > + case IOCTL_ND_CQ_CANCEL_IO: > + return "IOCTL_ND_CQ_CANCEL_IO\n"; > + case IOCTL_ND_CQ_GET_AFFINITY: > + return "IOCTL_ND_CQ_GET_AFFINITY\n"; > + case IOCTL_ND_CQ_MODIFY: > + return "IOCTL_ND_CQ_MODIFY\n"; > + > + case IOCTL_ND_CQ_NOTIFY: > + return "IOCTL_ND_CQ_NOTIFY\n"; > + > + > + case IOCTL_ND_LISTENER_CREATE: > + return "IOCTL_ND_LISTENER_CREATE\n"; > + > + case IOCTL_ND_LISTENER_FREE: > + return "IOCTL_ND_LISTENER_FREE\n"; > + > + case IOCTL_ND_QP_FREE: > + return "IOCTL_ND_QP_FREE\n"; > + > + case IOCTL_ND_CONNECTOR_CANCEL_IO: > + return "IOCTL_ND_CONNECTOR_CANCEL_IO\n"; > + > + case IOCTL_ND_LISTENER_CANCEL_IO: > + return "IOCTL_ND_LISTENER_CANCEL_IO\n"; > + > + case IOCTL_ND_LISTENER_BIND: > + return "IOCTL_ND_LISTENER_BIND\n"; > + > + case IOCTL_ND_LISTENER_LISTEN: > + return "IOCTL_ND_LISTENER_LISTEN\n"; > + > + case IOCTL_ND_LISTENER_GET_ADDRESS: > + return "IOCTL_ND_LISTENER_GET_ADDRESS\n"; > + > + case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: > + return "IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST\n"; > + > + > + > + case IOCTL_ND_CONNECTOR_CREATE: > + return "IOCTL_ND_CONNECTOR_CREATE\n"; > + > + case IOCTL_ND_CONNECTOR_FREE: > + return "IOCTL_ND_CONNECTOR_FREE\n"; > + > + case IOCTL_ND_CONNECTOR_BIND: > + return "IOCTL_ND_CONNECTOR_BIND\n"; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_CONNECT: > + return "IOCTL_ND_CONNECTOR_CONNECT\n"; > + > + case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: > + return "IOCTL_ND_CONNECTOR_COMPLETE_CONNECT\n"; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_ACCEPT: > + return "IOCTL_ND_CONNECTOR_ACCEPT\n"; > + > + case IOCTL_ND_CONNECTOR_REJECT: > + return "IOCTL_ND_CONNECTOR_REJECT\n"; > + > + case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: > + return "IOCTL_ND_CONNECTOR_GET_READ_LIMITS\n"; > + > + case 
IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: > + return "IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA\n"; > + > + case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: > + return "IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS\n"; > + > + case IOCTL_ND_CONNECTOR_GET_ADDRESS: > + return "IOCTL_ND_CONNECTOR_GET_ADDRESS\n"; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: > + return "IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT\n"; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_DISCONNECT: > + return "IOCTL_ND_CONNECTOR_DISCONNECT\n"; > + > + > + > + case IOCTL_ND_QP_CREATE: > + return "IOCTL_ND_QP_CREATE\n"; > + > + case IOCTL_ND_MR_CREATE: > + return "IOCTL_ND_MR_CREATE\n"; > + > + case IOCTL_ND_MR_FREE: > + return "IOCTL_ND_MR_FREE\n"; > + case IOCTL_ND_MR_REGISTER: > + return "IOCTL_ND_MR_REGISTER\n"; > + case IOCTL_ND_MR_DEREGISTER: > + return "IOCTL_ND_MR_DEREGISTER\n"; > + case IOCTL_ND_MR_CANCEL_IO: > + return "IOCTL_ND_MR_CANCEL_IO\n"; > + case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: > + return "IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST\n"; > + case IOCTL_ND_QP_FLUSH: > + return "IOCTL_ND_QP_FLUSH\n"; > + > + default: > + return "Unknown IOCTL\n"; > + } > +} > +int get_irp_handle(struct hvnd_dev *nd_dev, u32 *local, void *irp_ctx) > +{ > + unsigned int local_handle; > + int ret; > + > + local_handle = atomic_inc_return(&irp_local_hdl); > + *local = local_handle; > + > + /* > + * Now associate the local handle with the pointer. > + */ > + ret = insert_handle(nd_dev, &nd_dev->irpidr, irp_ctx, local_handle); > + hvnd_debug("irp_ctx=%p local_handle=%u\n", irp_ctx, local_handle); > + > + if (ret) { > + hvnd_error("insert_handle failed ret=%d\n", ret); > + return ret; > + } > + > + return 0; > +} > + > +void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp) > +{ > + remove_handle(nd_dev, &nd_dev->irpidr, irp); > + > +} > + > +static void init_pfn(u64 *pfn, void *addr, u32 length) > +{ > + int i; > + u32 offset = offset_in_page(addr); > + u32 num_pfn = DIV_ROUND_UP(offset + length, PAGE_SIZE); > + > + for (i = 0; i < num_pfn; i++) > + pfn[i] = virt_to_phys((u8 *)addr + > + (PAGE_SIZE * i)) >> PAGE_SHIFT; > +} > + > + > +static void user_va_init_pfn(u64 *pfn, struct ib_umem *umem) > +{ > + int entry; > + struct scatterlist *sg; > + int i = 0; > + > + for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) { > + pfn[i++] = page_to_pfn(sg_page(sg)); > + } > +} > + > +static u32 get_local_handle(void *p) > +{ > + u64 val = (unsigned long)p; > + > + return (u32)val; > +} > + > +static int hvnd_send_pg_buffer(struct hvnd_dev *nd_dev, > + struct vmbus_packet_mpb_array *desc, > + u32 desc_size, > + void *buffer, > + u32 bufferlen, u64 cookie) > +{ > + int ret; > + int t; > + struct hvnd_cookie hvnd_cookie; > + > + hvnd_cookie.pkt = (void *)cookie; > + init_completion(&hvnd_cookie.host_event); > + > + ret = vmbus_sendpacket_mpb_desc(nd_dev->hvdev->channel, > + desc, > + desc_size, > + buffer, bufferlen, > + (u64)(&hvnd_cookie)); > + > + if (ret) { > + hvnd_error("vmbus_sendpacket_mpb_desc failed ret=%d\n", ret); > + goto err; > + } > + > + t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); > + > + if (t == 0) { > + hvnd_error("wait_for_completion_timeout timed out\n"); > + ret = -ETIMEDOUT; > + } > + > +err: > + return ret; > +} > + > +static int hvnd_send_packet(struct hvnd_dev *nd_dev, void *buffer, > + u32 bufferlen, u64 cookie, bool block) > +{ > + int ret; > + int t; > + struct hvnd_cookie hvnd_cookie; > + > + hvnd_cookie.pkt = (void *)cookie; > + 
init_completion(&hvnd_cookie.host_event); > + > + ret = vmbus_sendpacket(nd_dev->hvdev->channel, buffer, bufferlen, > + (u64)(&hvnd_cookie), VM_PKT_DATA_INBAND, > + VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); > + > + if (ret) { > + hvnd_error("vmbus_send pkt failed: %d\n", ret); > + goto err; > + } > + > + if (!block) > + return ret; > + > + t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); > + > + if (t == 0) { > + hvnd_error("wait_for_completion_timeout timed out\n"); > + ret = -ETIMEDOUT; > + } > + > +err: > + return ret; > +} > + > +static int hvnd_send_pgbuf_ioctl_pkt(struct hvnd_dev *nd_dev, > + struct vmbus_packet_mpb_array *desc, > + u32 desc_size, > + struct ndv_packet_hdr_control_1 *hdr, > + u32 pkt_size, u64 cookie) > +{ > + int ret; > + int ioctl; > + > + ioctl = hdr->io_cntrl_code; > + > + > + ret = hvnd_send_pg_buffer(nd_dev, desc, desc_size, > + hdr, pkt_size, cookie); > + > + if (ret) > + return ret; > + > + if (hdr->pkt_hdr.status != 0) { > + hvnd_error("IOCTL: %s failed; status is %x\n", > + hvnd_get_op_name(ioctl), > + hdr->pkt_hdr.status); > + return -EINVAL; > + } > + > + switch (hdr->io_status) { > + case STATUS_SUCCESS: > + case STATUS_PENDING: > + return 0; > + > + default: > + hvnd_error("IOCTL: %s failed io status is %x\n", > + hvnd_get_op_name(ioctl), hdr->io_status); > + return -EINVAL; > + } > +} > + > +int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev, > + struct ndv_packet_hdr_control_1 *hdr, > + u32 pkt_size, u64 cookie) > +{ > + int ret; > + int ioctl; > + bool block; > + > + block = (hdr->irp_handle.val64 == 0) ? true : false; > + > + > + ioctl = hdr->io_cntrl_code; > + > + ret = hvnd_send_packet(nd_dev, hdr, pkt_size, cookie, block); > + > + if (ret) > + return ret; > + > + if (!block) > + return ret; > + > + if (hdr->pkt_hdr.status != 0) { > + hvnd_error("IOCTL: %s failed; status is %x\n", > + hvnd_get_op_name(ioctl), hdr->pkt_hdr.status); > + return -EINVAL; > + } > + > + switch (hdr->io_status) { > + case STATUS_SUCCESS: > + case STATUS_PENDING: > + return 0; > + > + default: > + hvnd_warn("IOCTL: %s failed io status is %x\n", > + hvnd_get_op_name(ioctl), hdr->io_status); > + return -EINVAL; > + } > +} > + > +void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr, > + u32 data_sz, u32 local, u32 remote, > + u32 ioctl_code, > + u32 ext_data_sz, u32 ext_data_offset, > + u64 irp_handle) > + > +{ > + int pkt_type; > + > + pkt_type = NDV_PKT_ID1_CONTROL; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + hdr->pkt_hdr.packet_type = pkt_type; > + hdr->pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > + hdr->pkt_hdr.data_sz = data_sz; > + > + hdr->pkt_hdr.status = 0; > + > + hdr->file_handle.local = local; > + hdr->file_handle.remote = remote; > + hdr->irp_handle.val64 = irp_handle; > + > + hdr->io_cntrl_code = ioctl_code; > + hdr->output_buf_sz = data_sz - ext_data_sz; > + hdr->input_buf_sz = data_sz - ext_data_sz; > + > + hdr->input_output_buf_offset = 0; > + > + hdr->extended_data.size = ext_data_sz; > + hdr->extended_data.offset = ext_data_offset; > +} > + > + > +int hvnd_create_file(struct hvnd_dev *nd_dev, void *uctx, > + struct ndv_pkt_hdr_create_1 *create, u32 file_flags) > +{ > + int ret; > + int pkt_type; > + > + > + pkt_type = NDV_PKT_ID1_CREATE; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + create->pkt_hdr.packet_type = pkt_type; > + create->pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); > + create->pkt_hdr.data_sz = 0; > + > + create->handle.local = 
get_local_handle(uctx); > + create->access_mask = STANDARD_RIGHTS_ALL; > + create->open_options = OPEN_EXISTING; > + create->file_attributes = FILE_ATTRIBUTE_NORMAL | file_flags; > + create->share_access = FILE_SHARE_ALL; > + > + ret = hvnd_send_packet(nd_dev, create, > + sizeof(struct ndv_pkt_hdr_create_1), > + (unsigned long)create, true); > + return ret; > +} > + > +int hvnd_cleanup_file(struct hvnd_dev *nd_dev, u32 local, u32 remote) > +{ > + int ret; > + int pkt_type; > + struct ndv_pkt_hdr_cleanup_1 cleanup_pkt; > + > + > + pkt_type = NDV_PKT_ID1_CLEANUP; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + > + cleanup_pkt.pkt_hdr.packet_type = pkt_type; > + cleanup_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); > + cleanup_pkt.pkt_hdr.data_sz = 0; > + > + cleanup_pkt.handle.local = local; > + cleanup_pkt.handle.remote = remote; > + > + ret = hvnd_send_packet(nd_dev, &cleanup_pkt, > + sizeof(struct ndv_pkt_hdr_create_1), > + (unsigned long)&cleanup_pkt, true); > + return ret; > +} > + > + > +static int hvnd_do_ioctl(struct hvnd_dev *nd_dev, u32 ioctl, > + struct pkt_nd_provider_ioctl *pkt, > + union ndv_context_handle *hdr_handle, > + struct nd_handle *ioctl_handle, > + u8 *buf, u32 buf_len, bool c_in, bool c_out, u64 irp_val) > +{ > + int ret; > + int pkt_type; > + > + pkt_type = NDV_PKT_ID1_CONTROL; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + > + pkt->hdr.pkt_hdr.packet_type = pkt_type; > + pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > + pkt->hdr.pkt_hdr.data_sz = (sizeof(struct pkt_nd_provider_ioctl) - > + sizeof(struct ndv_packet_hdr_control_1)); > + > + pkt->hdr.file_handle.local = hdr_handle->local; > + pkt->hdr.file_handle.remote = hdr_handle->remote; > + hvnd_debug("create handle local: %x remote: %x\n", > + hdr_handle->local, hdr_handle->remote); > + > + pkt->hdr.irp_handle.val64 = irp_val; > + > + pkt->hdr.io_cntrl_code = ioctl; > + pkt->hdr.output_buf_sz = sizeof(struct nd_ioctl); > + pkt->hdr.input_buf_sz = sizeof(struct nd_ioctl); > + pkt->hdr.input_output_buf_offset = 0; > + memset(&pkt->ioctl.handle, 0, sizeof(struct nd_handle)); > + pkt->ioctl.handle.version = ND_VERSION_1; > + > + switch (ioctl) { > + case IOCTL_ND_PROVIDER_BIND_FILE: > + pkt->ioctl.handle.handle = ioctl_handle->handle; > + break; > + default: > + break; > + }; > + > + /* > + * Copy the input buffer, if needed. 
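> + * c_in/c_out select the direction: the caller's buffer is copied > + * into the request before sending and/or back out of the reply > + * after completion.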
> + */ > + > + if (c_in && (buf != NULL)) > + memcpy(pkt->ioctl.raw_buffer, buf, buf_len); > + > + ret = hvnd_send_packet(nd_dev, pkt, > + sizeof(struct pkt_nd_provider_ioctl), > + (unsigned long)pkt, true); > + > + if (ret) > + return ret; > + > + if (c_out && (buf != NULL)) > + memcpy(buf, pkt->ioctl.raw_buffer, buf_len); > + > + return ret; > +} > + > +static int idr_callback(int id, void *p, void *data) > +{ > + if (p == data) > + return id; > + return 0; > +} > + > +void remove_uctx(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > +{ > + int pid = current_pid(); > + unsigned long flags; > + int id; > + > + if (get_uctx(nd_dev, pid) == uctx) > + remove_handle(nd_dev, &nd_dev->uctxidr, pid); > + else { > + hvnd_warn("uctx %p not found on pid %d, doing an idr search\n", > + uctx, current_pid()); > + > + spin_lock_irqsave(&nd_dev->id_lock, flags); > + id = idr_for_each(&nd_dev->uctxidr, idr_callback, uctx); > + spin_unlock_irqrestore(&nd_dev->id_lock, flags); > + > + if (id) > + remove_handle(nd_dev, &nd_dev->uctxidr, id); > + else { > + hvnd_error("uctx %p not found in idr table\n", uctx); > + return; > + } > + } > + > + kfree(uctx); > +} > + > +int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > +{ > + int ret; > + > + /* > + * First close the adaptor. > + */ > + > + ret = hvnd_free_handle(nd_dev, uctx, > + uctx->adaptor_hdl, > + IOCTL_ND_ADAPTER_CLOSE); > + > + if (ret) > + hvnd_error("Adaptor close failed; ret is %x\n", ret); > + > + /* > + * Now close the two files we created. > + */ > + > + ret = hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, > + uctx->file_handle_ovl.remote); > + > + if (ret) > + hvnd_error("file cleanup failed; ret is %x\n", ret); > + > + ret = hvnd_cleanup_file(nd_dev, uctx->file_handle.local, > + uctx->file_handle.remote); > + > + if (ret) > + hvnd_error("File cleanup failed; ret is %x\n", ret); > + > + /* > + * Remove the uctx from the ID table. > + */ > + remove_uctx(nd_dev, uctx); > + > + return 0; > +} > + > +int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > +{ > + int ret; > + struct pkt_nd_provider_ioctl *pr_init = &uctx->pr_init_pkt; > + int pkt_type; > + struct nd_handle ioctl_handle; > + struct pkt_nd_open_adapter *pr_o_adap = &uctx->o_adap_pkt; > + > + ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt, 0); > + if (ret) { > + hvnd_error("hvnd_create_file failed ret=%d\n", ret); > + goto error_cr; > + } > + > + if (uctx->create_pkt.pkt_hdr.status != 0) { > + hvnd_error("create File failed; status is %d\n", > + uctx->create_pkt.pkt_hdr.status); > + ret = -EINVAL; > + goto error_cr; > + } > + > + uctx->file_handle.local = uctx->create_pkt.handle.local; > + uctx->file_handle.remote = uctx->create_pkt.handle.remote; > + hvnd_debug("INITIALIZE PROVIDER\n"); > + /* > + * Now Initialize the Provider. > + */ > + ioctl_handle.handle = 0; > + ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_INIT, pr_init, > + &uctx->create_pkt.handle, > + &ioctl_handle, NULL, 0, false, false, 0); > + > + if (ret) { > + ret = -EINVAL; > + goto error_pr_init; > + } > + > + if (pr_init->hdr.pkt_hdr.status != 0) { > + hvnd_error("Provider INIT failed; status is %d\n", > + pr_init->hdr.pkt_hdr.status); > + ret = -EINVAL; > + goto error_pr_init; > + } > + > + if (pr_init->hdr.io_status != 0) { > + hvnd_error("Provider INIT failed; io status is %d\n", > + pr_init->hdr.io_status); > + ret = -EINVAL; > + goto error_pr_init; > + } > + > + /* > + * Now create the overlap file.
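> + * The overlapped file handle carries the asynchronous (IRP-backed) > + * requests; IOCTL_ND_PROVIDER_BIND_FILE below ties it to the > + * provider file created above.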
> + */ > + > + hvnd_debug("CREATE OVERLAP FILE\n"); > + ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt_ovl, > + FILE_FLAG_OVERLAPPED); > + if (ret) { > + hvnd_error("hvnd_create_file failed ret=%d\n", ret); > + goto error_pr_init; > + } > + > + if (uctx->create_pkt_ovl.pkt_hdr.status != 0) { > + hvnd_error("create Overlap File failed; status is %d\n", > + uctx->create_pkt_ovl.pkt_hdr.status); > + ret = -EINVAL; > + goto error_pr_init; > + } > + uctx->file_handle_ovl.local = uctx->create_pkt_ovl.handle.local; > + uctx->file_handle_ovl.remote = uctx->create_pkt_ovl.handle.remote; > + > + /* > + * Now bind the two file handles together. > + */ > + > + hvnd_debug("BIND FILE IOCTL remote handle: %d local handle: %d\n", > + uctx->create_pkt_ovl.handle.remote, > + uctx->create_pkt_ovl.handle.local); > + > + ioctl_handle.handle = uctx->create_pkt_ovl.handle.val64; > + ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_BIND_FILE, pr_init, > + &uctx->create_pkt.handle, > + &ioctl_handle, NULL, 0, false, false, 0); > + > + if (ret) { > + ret = -EINVAL; > + goto error_file_bind; > + } > + if (pr_init->hdr.pkt_hdr.status != 0) { > + hvnd_error("Provider File bind failed; status is %d\n", > + pr_init->hdr.pkt_hdr.status); > + ret = -EINVAL; > + goto error_file_bind; > + } > + if (pr_init->hdr.io_status != 0) { > + hvnd_error("Provider INIT failed; io status is %d\n", > + pr_init->hdr.io_status); > + ret = -EINVAL; > + goto error_file_bind; > + } > + > + /* > + * Now open the adaptor. > + */ > + > + hvnd_debug("OPENING THE ADAPTOR\n"); > + > + pkt_type = NDV_PKT_ID1_CONTROL; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + pr_o_adap->hdr.pkt_hdr.packet_type = pkt_type; > + pr_o_adap->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > + pr_o_adap->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_open_adapter) - > + sizeof(struct ndv_packet_hdr_control_1); > + > + pr_o_adap->hdr.pkt_hdr.status = 0; > + > + hvnd_debug("hdr sz is %d\n", pr_o_adap->hdr.pkt_hdr.hdr_sz); > + hvnd_debug("data sz is %d\n", pr_o_adap->hdr.pkt_hdr.data_sz); > + > + pr_o_adap->hdr.file_handle.local = uctx->create_pkt.handle.local; > + pr_o_adap->hdr.file_handle.remote = uctx->create_pkt.handle.remote; > + hvnd_debug("create handle local is %x\n", > + uctx->create_pkt.handle.local); > + hvnd_debug("create handle remote is %x\n", > + uctx->create_pkt.handle.remote); > + pr_o_adap->hdr.irp_handle.val64 = 0; > + > + pr_o_adap->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_OPEN; > + pr_o_adap->hdr.output_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - > + sizeof(struct extended_data_oad); > + pr_o_adap->hdr.input_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - > + sizeof(struct extended_data_oad); > + > + hvnd_debug("output buf sz is %d\n", pr_o_adap->hdr.output_buf_sz); > + hvnd_debug("input buf sz is %d\n", pr_o_adap->hdr.input_buf_sz); > + hvnd_debug("packet size is %d\n", > + (int)sizeof(struct pkt_nd_open_adapter)); > + > + pr_o_adap->hdr.input_output_buf_offset = 0; > + > + > + pr_o_adap->hdr.extended_data.size = sizeof(struct extended_data_oad); > + pr_o_adap->hdr.extended_data.offset = > + offsetof(struct pkt_nd_open_adapter, ext_data) - > + sizeof(struct ndv_packet_hdr_control_1); > + > + hvnd_debug("size of the extended data size: %d\n", > + (int)sizeof(struct extended_data_oad)); > + hvnd_debug("offset of extended data: %d\n", > + pr_o_adap->hdr.extended_data.offset); > + > + /* > + * Now fill out the ioctl section. 
> + */ > + > + pr_o_adap->ioctl.input.version = ND_VERSION_1; > + pr_o_adap->ioctl.input.ce_mapping_cnt = > + RTL_NUMBER_OF(pr_o_adap->mappings.ctx_input.mappings); > + > + hvnd_debug("ce_mapping cnt is %d\n", > + pr_o_adap->ioctl.input.ce_mapping_cnt); > + > + pr_o_adap->ioctl.input.cb_mapping_offset = sizeof(union oad_ioctl); > + hvnd_debug("cb_mapping offset is %d\n", > + pr_o_adap->ioctl.input.cb_mapping_offset); > + pr_o_adap->ioctl.input.adapter_id = (u64)nd_dev; > + > + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_type = ND_MAP_IOSPACE; > + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cache_type = ND_NON_CACHED; > + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cb_length = 4096; > + > + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_type = ND_MAP_IOSPACE; > + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cache_type = ND_WRITE_COMBINED; > + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cb_length = 4096; > + > + /* > + * Fill in the extended data. > + */ > + pr_o_adap->ext_data.cnt = IBV_GET_CONTEXT_MAPPING_MAX; > + > + ret = hvnd_send_packet(nd_dev, pr_o_adap, > + sizeof(struct pkt_nd_open_adapter), > + (unsigned long)pr_o_adap, true); > + if (ret) { > + ret = -EINVAL; > + goto error_file_bind; > + } > + > + if (pr_o_adap->hdr.pkt_hdr.status != 0) { > + hvnd_error("Open adaptor failed; status is %d\n", > + pr_o_adap->hdr.pkt_hdr.status); > + ret = -EINVAL; > + goto error_file_bind; > + } > + > + if (pr_o_adap->hdr.io_status != 0) { > + hvnd_error("Open adaptor failed; io status is %d\n", > + pr_o_adap->hdr.io_status); > + ret = -EINVAL; > + goto error_file_bind; > + } > + > + /* > + * Copy the necessary response from the host. > + */ > + > + uctx->adaptor_hdl = pr_o_adap->ioctl.resrc_desc.handle; > + > + > + hvnd_debug("adaptor handle: %p\n", (void *)uctx->adaptor_hdl); > + > + uctx->uar_base = > + pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_UAR].info; > + hvnd_debug("uar base: %p\n", (void *)uctx->uar_base); > + > + uctx->bf_base = > + pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_BF].info; > + hvnd_debug("bf base: %p\n", (void *)uctx->bf_base); > + > + uctx->bf_buf_size = > + pr_o_adap->mappings.ctx_output.bf_buf_size; > + hvnd_debug("bf buf size: %d\n", uctx->bf_buf_size); > + > + uctx->bf_offset = > + pr_o_adap->mappings.ctx_output.bf_offset; > + hvnd_debug("bf offset: %d\n", uctx->bf_offset); > + > + uctx->cqe_size = > + pr_o_adap->mappings.ctx_output.cqe_size; > + hvnd_debug("cqe size: %d\n", uctx->cqe_size); > + > + uctx->max_qp_wr = > + pr_o_adap->mappings.ctx_output.max_qp_wr; > + hvnd_debug("max qp wr: %d\n", uctx->max_qp_wr); > + > + uctx->max_sge = > + pr_o_adap->mappings.ctx_output.max_sge; > + hvnd_debug("max sge: %d\n", uctx->max_sge); > + > + uctx->max_cqe = > + pr_o_adap->mappings.ctx_output.max_cqe; > + hvnd_debug("max cqe: %d\n", uctx->max_cqe); > + > + uctx->num_qps = > + pr_o_adap->mappings.ctx_output.qp_tab_size; > + hvnd_debug("num qps: %d\n", uctx->num_qps); > + > + /* > + * Now query the adaptor and stash away the adaptor info.
> + */ > + > + ret = hvnd_query_adaptor(nd_dev, uctx); > + if (ret) { > + hvnd_error("Query Adaptor failed; ret is %d\n", ret); > + goto query_err; > + } > + > + return ret; > + > +query_err: > + hvnd_free_handle(nd_dev, uctx, > + uctx->adaptor_hdl, > + IOCTL_ND_ADAPTER_CLOSE); > + > + hvnd_error("Open Adaptor Failed!!\n"); > + > +error_file_bind: > + hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, > + uctx->file_handle_ovl.remote); > + > +error_pr_init: > + hvnd_cleanup_file(nd_dev, uctx->file_handle.local, > + uctx->file_handle.remote); > + > +error_cr: > + if (get_uctx(nd_dev, current_pid()) != NULL) > + remove_handle(nd_dev, &nd_dev->uctxidr, current_pid()); > + > + return ret; > +} > + > +int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_cq *cq) > +{ > + int ret; > + struct pkt_nd_create_cq *pkt; > + int num_pfn, num_db_pfn, num_sn_pfn; > + int cq_pkt_size; > + unsigned int cq_buf_size, offset; > + u32 ext_data_sz; > + u32 ext_data_offset; > + > + /* > + * Now create CQ. > + * First compute the number of PFNs we need to accommodate: > + * one each for the doorbell and arm_sn, plus the pages in the > + * cq buffer. > + */ > + cq_buf_size = (cq->entries * uctx->cqe_size); > + offset = offset_in_page(cq->cq_buf); > + num_pfn = DIV_ROUND_UP(offset + cq_buf_size, PAGE_SIZE); > + > + offset = offset_in_page(cq->db_addr); > + num_db_pfn = DIV_ROUND_UP(offset + 8, PAGE_SIZE); > + > + offset = offset_in_page(&cq->arm_sn); > + num_sn_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); > + > + cq_pkt_size = sizeof(struct pkt_nd_create_cq) + > + (num_pfn * sizeof(u64)); > + > + ext_data_sz = sizeof(struct create_cq_ext_data) + > + (num_pfn * sizeof(u64)); > + ext_data_offset = offsetof(struct pkt_nd_create_cq, ext_data) - > + sizeof(struct ndv_packet_hdr_control_1); > + > + hvnd_debug("CREATE CQ, num user addr pfns is %d\n", num_pfn); > + hvnd_debug("CREATE CQ, num db pfns is %d\n", num_db_pfn); > + > + pkt = kzalloc(cq_pkt_size, GFP_KERNEL); > + > + if (!pkt) > + return -ENOMEM; > + > + hvnd_init_hdr(&pkt->hdr, > + (cq_pkt_size - > + sizeof(struct ndv_packet_hdr_control_1)), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CQ_CREATE, > + ext_data_sz, > + ext_data_offset, > + 0); > + > + /* > + * Now fill out the ioctl section.
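> + * Three mappings accompany the request: the CQ ring itself, the > + * 8-byte doorbell and the 4-byte arm sequence number; their PFN > + * lists travel in the extended data appended to the ioctl.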
> + */ > + > + pkt->ioctl.input.version = ND_VERSION_1; > + pkt->ioctl.input.queue_depth = cq->entries; > + pkt->ioctl.input.ce_mapping_cnt = MLX4_IB_CREATE_CQ_MAPPING_MAX; > + pkt->ioctl.input.cb_mapping_offset = sizeof(union create_cq_ioctl); > + > + hvnd_debug("ce_mapping cnt is %d\n", pkt->ioctl.input.ce_mapping_cnt); > + hvnd_debug("cb_mapping offset is %d\n", > + pkt->ioctl.input.cb_mapping_offset); > + > + pkt->ioctl.input.adapter_handle = uctx->adaptor_hdl; > + pkt->ioctl.input.affinity.mask = 0; > + pkt->ioctl.input.affinity.group = -1; > + > + // 0 for usermode CQ arming > + pkt->mappings.cq_in.flags = 0; > + > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.map_type = ND_MAP_MEMORY; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.access_type = ND_MODIFY_ACCESS; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.address = (u64)cq->cq_buf; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.cb_length = (cq->entries * uctx->cqe_size); > + > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.access_type = ND_WRITE_ACCESS; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.address = (u64)cq->db_addr; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.cb_length = 8; //size of two ints > + > + > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.map_type = ND_MAP_MEMORY; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.access_type = ND_MODIFY_ACCESS; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.address = (u64)&cq->arm_sn; > + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.cb_length = 4; //size of one int > + /* > + * Fill in the extended data. > + */ > + > + pkt->ext_data.cnt = 3; > + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].size = (sizeof(struct gpa_range) + (num_pfn * sizeof(u64))); > + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].offset = offsetof(struct create_cq_ext_data, cqbuf_gpa); > + > + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].size = sizeof(struct cq_db_gpa); > + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].offset = offsetof(struct create_cq_ext_data, db_gpa); > + > + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].size = sizeof(struct cq_db_gpa); > + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].offset = offsetof(struct create_cq_ext_data, sn_gpa); > + > + /* > + * Fill up the gpa range for cq buffer. > + */ > + > + pkt->ext_data.db_gpa.byte_count = 8; > + pkt->ext_data.db_gpa.byte_offset = offset_in_page(cq->db_addr); > + user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], cq->db_umem); > + > + pkt->ext_data.sn_gpa.byte_count = 4; > + pkt->ext_data.sn_gpa.byte_offset = offset_in_page(&cq->arm_sn); > + init_pfn(&pkt->ext_data.sn_gpa.pfn_array[0], > + &cq->arm_sn, > + 4); > + > + pkt->ext_data.cqbuf_gpa.byte_count = (cq->entries * uctx->cqe_size); > + pkt->ext_data.cqbuf_gpa.byte_offset = offset_in_page(cq->cq_buf); > + user_va_init_pfn(&pkt->ext_data.cqbuf_gpa.pfn_array[0], cq->umem); > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, cq_pkt_size, (u64)pkt); > + > + if (ret) > + goto cr_cq_err; > + > + /* > + * Copy the necessary response from the host. 
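The extension data above is a classic variable-length trailer: the packet is allocated as sizeof(struct) plus num_pfn u64 entries, and the fields[] descriptors record each member's size and offsetof() position so the host can locate them. A sketch of the allocation pattern with an illustrative struct (not the driver's gpa_range):

#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative analog of a GPA range with a trailing PFN list. */
struct gpa_range_demo {
	uint32_t byte_count;
	uint32_t byte_offset;
	uint64_t pfn_array[];	/* flexible array member */
};

int main(void)
{
	int num_pfn = 3;
	struct gpa_range_demo *r =
		malloc(sizeof(*r) + num_pfn * sizeof(uint64_t));

	if (!r)
		return 1;
	r->byte_count = 8192;
	r->byte_offset = 100;
	for (int i = 0; i < num_pfn; i++)
		r->pfn_array[i] = 0x1000 + i;	/* fake PFNs */

	printf("header %zu bytes, pfn[] at offset %zu, total %zu bytes\n",
	       sizeof(*r), offsetof(struct gpa_range_demo, pfn_array),
	       sizeof(*r) + num_pfn * sizeof(uint64_t));
	free(r);
	return 0;
}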
> + */ > + cq->cqn = pkt->mappings.cq_resp.cqn; > + cq->cqe = pkt->mappings.cq_resp.cqe; > + cq->cq_handle = pkt->ioctl.resrc_desc.handle; > + > + ret = insert_handle(nd_dev, &nd_dev->cqidr, cq, cq->cqn); > + > + if (ret) > + goto cr_cq_err; > + hvnd_debug("CQ create after success cqn is %d\n", cq->cqn); > + hvnd_debug("CQ create after success cqe is %d\n", cq->cqe); > + hvnd_debug("CQ create after success cq handle is %p\n", > + (void *)cq->cq_handle); > + > +cr_cq_err: > + kfree(pkt); > + return ret; > +} > + > +int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq) > +{ > + struct pkt_nd_free_cq free_cq_pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&free_cq_pkt, 0, sizeof(free_cq_pkt)); > + hvnd_init_hdr(&free_cq_pkt.hdr, > + sizeof(struct pkt_nd_free_cq) - > + sizeof(struct ndv_packet_hdr_control_1), > + cq->uctx->create_pkt.handle.local, > + cq->uctx->create_pkt.handle.remote, > + IOCTL_ND_CQ_FREE, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + free_cq_pkt.ioctl.in.version = ND_VERSION_1; > + free_cq_pkt.ioctl.in.handle = cq->cq_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &free_cq_pkt.hdr, > + sizeof(struct pkt_nd_free_cq), > + (u64)&free_cq_pkt); > + > + if (ret) > + goto free_cq_err; > + > + remove_handle(nd_dev, &nd_dev->cqidr, cq->cqn); > + > + return 0; > + > +free_cq_err: > + return ret; > +} > + > + > +int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq, > + u32 notify_type, u64 irp_handle) > +{ > + struct pkt_nd_notify_cq notify_cq_pkt; > + int ret; > + union ndv_context_handle irp_fhandle; > + > + irp_fhandle.local = cq->ep_object.local_irp; > + > + // KYS try to avoid having to zero everything > + memset(¬ify_cq_pkt, 0, sizeof(notify_cq_pkt)); > + hvnd_init_hdr(¬ify_cq_pkt.hdr, > + sizeof(struct pkt_nd_notify_cq) - > + sizeof(struct ndv_packet_hdr_control_1), > + cq->uctx->create_pkt.handle.local, > + cq->uctx->create_pkt.handle.remote, > + IOCTL_ND_CQ_NOTIFY, 0, 0, irp_fhandle.val64); > + > + /* > + * Now fill in the ioctl section. > + */ > + notify_cq_pkt.ioctl.in.version = ND_VERSION_1; > + notify_cq_pkt.ioctl.in.cq_handle = cq->cq_handle; > + notify_cq_pkt.ioctl.in.type = notify_type; > + > + > + ret = hvnd_send_ioctl_pkt(nd_dev, ¬ify_cq_pkt.hdr, > + sizeof(struct pkt_nd_notify_cq), > + (u64)¬ify_cq_pkt); > + > + return ret; > +} > + > +/* > + * Memory region operations. > + */ > +int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 pd_handle, u64 *mr_handle) > +{ > + struct pkt_nd_create_mr pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_MR_CREATE, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.handle = pd_handle; > + hvnd_debug("PD handle is %p\n", (void *)pd_handle); > + pkt.ioctl.in.reserved = 0; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + /* > + * Copy the handle. 
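hvnd_destroy_cq and hvnd_notify_cq above follow the shape that nearly every call below repeats: zero a stack packet, initialize the control header with the payload size (sizeof(pkt) minus the control header), fill the ioctl input, send, and branch on the return. A condensed sketch of that skeleton, with simplified stand-ins for the driver's header and send routine:

#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Hypothetical shapes; hvnd_init_hdr/hvnd_send_ioctl_pkt take more
 * arguments (file handles, irp handle, extended-data geometry). */
struct ctrl_hdr { uint32_t data_sz; uint32_t ioctl_code; };
struct req_pkt  { struct ctrl_hdr hdr; uint32_t version; uint64_t handle; };

static void init_hdr(struct ctrl_hdr *h, uint32_t data_sz, uint32_t code)
{
	h->data_sz = data_sz;
	h->ioctl_code = code;
}

static int send_pkt(const struct req_pkt *p)
{
	printf("ioctl %u, payload %u bytes\n", p->hdr.ioctl_code,
	       p->hdr.data_sz);
	return 0;	/* pretend the host accepted it */
}

int main(void)
{
	struct req_pkt pkt;

	memset(&pkt, 0, sizeof(pkt));	/* the recurring "KYS" note flags this cost */
	init_hdr(&pkt.hdr, sizeof(pkt) - sizeof(struct ctrl_hdr), 42);
	pkt.version = 1;
	pkt.handle = 0xabcd;
	return send_pkt(&pkt);
}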
> + */ > + hvnd_debug("mr handle is %p\n", (void *)pkt.ioctl.out); > + *mr_handle = pkt.ioctl.out; > + > + return 0; > + > +err: > + hvnd_error("create mr failed: %d\n", ret); > + return ret; > + > +} > + > +int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle) > +{ > + return hvnd_free_handle(nd_dev, uctx, handle, IOCTL_ND_MR_FREE); > +} > + > +int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle) > +{ > + struct pkt_nd_deregister_mr pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_MR_DEREGISTER, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.handle = handle; > + pkt.ioctl.in.reserved = 0; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + hvnd_error("de-register mr failed: %d\n", ret); > + return ret; > + > +} > + > +static inline u32 hvnd_convert_access(int acc) > +{ > + return (acc & IB_ACCESS_REMOTE_WRITE ? ND_MR_FLAG_ALLOW_REMOTE_WRITE : 0) | > + (acc & IB_ACCESS_REMOTE_READ ? ND_MR_FLAG_ALLOW_REMOTE_READ : 0) | > + (acc & IB_ACCESS_LOCAL_WRITE ? ND_MR_FLAG_ALLOW_LOCAL_WRITE : 0); > +} > + > + > +int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_mr *mr) > +{ > + struct pkt_nd_register_mr pkt; > + int ret; > + struct hv_mpb_array *pb; > + struct vmbus_packet_mpb_array *tpb; > + int sz_leaf; > + int num_pgs; > + int i = 0; > + int ext_data_sz; > + u32 acc_flags; > + u32 desc_size; > + int pkt_type; > + > + /* > + * The user address is passed in via a two level structure. > + * An array of struct hv_page_buffer will be used to describe > + * the user memory. The pages containing this array will be described > + * in another array of struct hv_page_buffer. We pass this second level > + * array to the host. > + */ > + > + hvnd_debug("ib_umem_page_count(mr->umem)=%d\n", > + ib_umem_page_count(mr->umem)); > + > + sz_leaf = ib_umem_page_count(mr->umem) * sizeof(u64) + > + sizeof(struct hv_mpb_array); > + > + pb = (struct hv_mpb_array *) __get_free_pages(GFP_KERNEL|__GFP_ZERO, > + get_order(sz_leaf)); > + > + if (pb == NULL) > + return -ENOMEM; > + /* > + * Allocate an array of hv_page_buffer to describe the first level. > + */ > + num_pgs = DIV_ROUND_UP(sz_leaf, PAGE_SIZE); > + hvnd_debug("num pages in the top array is %d\n", num_pgs); > + > + desc_size = (num_pgs * sizeof(u64) + > + sizeof(struct vmbus_packet_mpb_array)); > + tpb = (struct vmbus_packet_mpb_array *) > + __get_free_pages(GFP_KERNEL|__GFP_ZERO, get_order(desc_size)); > + > + if (tpb == NULL) { > + free_pages((unsigned long)pb, get_order(sz_leaf)); > + return -ENOMEM; > + } > + > + hvnd_debug("sz leaf: %d; pgs in top %d\n", sz_leaf, num_pgs); > + > + /* > + * Now fill the leaf level array. > + */ > + pb->len = mr->length; > + pb->offset = offset_in_page(mr->start); > + user_va_init_pfn(pb->pfn_array, mr->umem); > + > + /* > + * Now fill out the top level array.
> + */ > + for (i = 0; i < num_pgs; i++) { > + tpb->range.pfn_array[i] = virt_to_phys( > + (u8 *)pb + (PAGE_SIZE * i)) >> PAGE_SHIFT; > + hvnd_debug("virtual address = %p\n", > + (u8 *)pb + (PAGE_SIZE * i)); > + hvnd_debug("physical address = %llx\n", > + virt_to_phys((u8 *)pb + (PAGE_SIZE * i))); > + hvnd_debug("tpb->range.pfn_array[%d]=%llx\n", i, > + tpb->range.pfn_array[i]); > + } > + > + tpb->range.offset = 8; > + tpb->range.len = ib_umem_page_count(mr->umem) * sizeof(u64); > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + ext_data_sz = (ib_umem_page_count(mr->umem) * sizeof(u64)); > + acc_flags = ND_MR_FLAG_DO_NOT_SECURE_VM | hvnd_convert_access(mr->acc); > + hvnd_debug("memory register access flags are: %x\n", acc_flags); > + > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_MR_REGISTER, 0, 0, 0); > + > + /* > + * The memory registration call uses a different mechanism to pass > + * pfn information. > + */ > + > + pkt_type = pkt.hdr.pkt_hdr.packet_type; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTION_EXTERNAL_DATA); > + pkt.hdr.pkt_hdr.packet_type = pkt_type; > + > + pkt.hdr.extended_data.size = ext_data_sz; > + pkt.hdr.extended_data.offset = 0; > + /* > + * Now fill out the ioctl. > + */ > + > + pkt.ioctl.in.header.version = ND_VERSION_1; > + pkt.ioctl.in.header.flags = acc_flags; > + pkt.ioctl.in.header.cb_length = mr->length; > + pkt.ioctl.in.header.target_addr = mr->virt; > + pkt.ioctl.in.header.mr_handle = mr->mr_handle; > + pkt.ioctl.in.address = mr->virt; > + > + /* > + * Now send the packet to the host. > + */ > + > + ret = hvnd_send_pgbuf_ioctl_pkt(nd_dev, > + tpb, desc_size, > + &pkt.hdr, > + sizeof(pkt), > + (unsigned long)&pkt); > + > + if (ret) > + goto err; > + > + hvnd_info("MR REGISTRATION SUCCESS\n"); > + /* > + * Copy the mr registration data. > + */ > + hvnd_debug("mr registration lkey %x\n", pkt.ioctl.out.lkey); > + hvnd_debug("mr registration rkey %x\n", pkt.ioctl.out.rkey); > + > + mr->mr_lkey = pkt.ioctl.out.lkey; > + mr->mr_rkey = pkt.ioctl.out.rkey; > + > + mr->ibmr.lkey = mr->mr_lkey; > + mr->ibmr.rkey = be32_to_cpu(mr->mr_rkey); > + hvnd_debug("ibmr registration lkey %x\n", mr->ibmr.lkey); > + hvnd_debug("ibmr registration rkey %x\n", mr->ibmr.rkey); > + > + free_pages((unsigned long)pb, get_order(sz_leaf)); > + free_pages((unsigned long)tpb, get_order(desc_size)); > + > + return 0; > + > +err: > + free_pages((unsigned long)pb, get_order(sz_leaf)); > + free_pages((unsigned long)tpb, get_order(desc_size)); > + > + hvnd_error("mr register failed: %d\n", ret); > + return ret; > +} > + > +/* > + * Listener operations. > + */ > +int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 *listener_handle) > +{ > + struct pkt_nd_cr_listener pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_LISTENER_CREATE, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. 
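To make the two-level description above concrete: the leaf table holds one PFN per user page, and the top-level table holds the PFNs of the pages that contain the leaf table itself; only the top level crosses to the host. A userspace model of the indexing (userspace has no virt_to_phys, so an identity mapping stands in for it):

#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define PAGE_SIZE  4096UL
#define PAGE_SHIFT 12
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Stand-in for virt_to_phys(), purely for the demo. */
static uint64_t fake_virt_to_phys(const void *va)
{
	return (uint64_t)(uintptr_t)va;
}

int main(void)
{
	unsigned long user_pages = 2000;	/* pages in the region */
	size_t leaf_sz = user_pages * sizeof(uint64_t);
	uint64_t *leaf;
	int top_pages, i;

	/* Leaf level: one u64 PFN slot per user page. */
	if (posix_memalign((void **)&leaf, PAGE_SIZE, leaf_sz))
		return 1;

	/* Top level: PFNs of the pages holding the leaf table. */
	top_pages = (int)DIV_ROUND_UP(leaf_sz, PAGE_SIZE);
	for (i = 0; i < top_pages; i++)
		printf("top[%d] = %#llx\n", i, (unsigned long long)
		       (fake_virt_to_phys((char *)leaf + PAGE_SIZE * i)
			>> PAGE_SHIFT));

	free(leaf);
	return 0;	/* 2000 PFNs need 16000 bytes, i.e. 4 top-level pages */
}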
> + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; > + hvnd_debug("Adaptor handle is %p\n", (void *)uctx->adaptor_hdl); > + pkt.ioctl.in.hdr.reserved = 0; > + pkt.ioctl.in.to_semantics = false; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + /* > + * Copy the listener handle. > + */ > + hvnd_debug("listener handle is %p\n", (void *)pkt.ioctl.out); > + *listener_handle = pkt.ioctl.out; > + > + return 0; > + > +err: > + hvnd_error("create listener failed: ret=%d uctx=%p adaptor handle=%llu\n", > + ret, uctx, uctx->adaptor_hdl); > + return ret; > + > +} > + > +int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle) > +{ > + struct pkt_nd_free_listener pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_LISTENER_FREE, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.handle = listener_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + hvnd_error("free listener failed: %d\n", ret); > + return ret; > +} > + > +int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, union nd_sockaddr_inet *addr) > +{ > + struct pkt_nd_bind_listener pkt; > + kuid_t uid = current_uid(); > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_LISTENER_BIND, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.handle = listener_handle; > + pkt.ioctl.in.hdr.reserved = 0; > + > + pkt.ioctl.in.authentication_id = (u32)uid.val; > + pkt.ioctl.in.is_admin = false; > + > + memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + hvnd_error("bind listener failed: %d\n", ret); > + return ret; > +} > + > +int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, u32 backlog) > +{ > + struct pkt_nd_listen_listener pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_LISTENER_LISTEN, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. 
> + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.listener_handle = listener_handle; > + pkt.ioctl.in.back_log = backlog; > + > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + hvnd_error("listen listener failed: %d\n", ret); > + return ret; > +} > + > +int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, union nd_sockaddr_inet *addr) > +{ > + struct pkt_nd_get_addr_listener pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_LISTENER_GET_ADDRESS, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.handle = listener_handle; > + > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + /* > + * Copy the address. > + */ > + > + memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); > + > + return 0; > + > +err: > + hvnd_error("get listener address failed: %d\n", ret); > + return ret; > +} > + > +int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 listener_handle, u64 connector_handle, > + u64 irp_handle) > +{ > + struct pkt_nd_get_connection_listener pkt; > + int ret; > + union ndv_context_handle irp_fhandle; > + > + ret = get_irp_handle(nd_dev, &irp_fhandle.local, (void *)irp_handle); > + > + if (ret) { > + hvnd_error("get_irp_handle() failed: err: %d\n", ret); > + return ret; > + } > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST, 0, 0, > + irp_fhandle.val64); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.listener_handle = listener_handle; > + pkt.ioctl.in.connector_handle = connector_handle; > + > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + hvnd_error("get connection listener failed: %d\n", ret); > + return ret; > +} > + > +/* > + * Connector APIs. > + */ > + > +int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 *connector_handle) > +{ > + struct pkt_nd_cr_connector pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_CREATE, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; > + pkt.ioctl.in.to_semantics = false; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + /* > + * Copy the connector handle.
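get_irp_handle() above (and map_irp_to_ctx() in the channel callback further down) amount to a small integer-to-pointer registry: the 32-bit handle travels in the packet header and is mapped back to the endpoint object when the host completes the request asynchronously. A toy single-threaded version of the idea (the real driver presumably locks such a table; the names here are invented):

#include <stdio.h>

#define MAX_IRP 64

static void *irp_tbl[MAX_IRP];

/* Hand out the first free slot; -1 when the table is full. */
static int irp_alloc(void *ctx, int *handle)
{
	for (int i = 0; i < MAX_IRP; i++) {
		if (!irp_tbl[i]) {
			irp_tbl[i] = ctx;
			*handle = i;
			return 0;
		}
	}
	return -1;
}

static void *irp_lookup(int handle)
{
	return (handle >= 0 && handle < MAX_IRP) ? irp_tbl[handle] : NULL;
}

int main(void)
{
	int ep_obj = 123, h;

	if (irp_alloc(&ep_obj, &h))
		return 1;
	printf("irp %d -> %p\n", h, irp_lookup(h));
	return 0;
}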
> + */ > + hvnd_debug("connector handle is %p\n", (void *)pkt.ioctl.out); > + *connector_handle = pkt.ioctl.out; > + > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle) > +{ > + struct pkt_nd_free_connector pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_FREE, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.handle = handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle, union nd_sockaddr_inet *addr) > +{ > + struct pkt_nd_bind_connector pkt; > + int ret; > + kuid_t uid = current_uid(); > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_BIND, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.handle = handle; > + > + memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); > + > + pkt.ioctl.in.authentication_id = (u32)uid.val; > + pkt.ioctl.in.is_admin = false; > + > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit, > + u32 priv_data_length, const u8 *priv_data, > + u64 qp_handle, struct if_physical_addr *phys_addr, > + union nd_sockaddr_inet *dest_addr, > + struct hvnd_ep_obj *ep) > +{ > + struct pkt_nd_connector_connect *pkt = &ep->connector_connect_pkt; > + int ret; > + union ndv_context_handle irp_fhandle; > + > + hvnd_debug("local irp is %d\n", ep->local_irp); > + irp_fhandle.local = ep->local_irp; > + > + if (priv_data_length > MAX_PRIVATE_DATA_LEN) { > + hvnd_error("priv_data_length=%d\n", priv_data_length); > + return -EINVAL; > + } > + > + // KYS try to avoid having to zero everything > + memset(pkt, 0, sizeof(*pkt)); > + hvnd_init_hdr(&pkt->hdr, > + sizeof(*pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_CONNECT, 0, 0, irp_fhandle.val64); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt->ioctl.in.hdr.version = ND_VERSION_1; > + pkt->ioctl.in.hdr.connector_handle = connector_handle; > + pkt->ioctl.in.hdr.read_limits.inbound = in_rd_limit; > + pkt->ioctl.in.hdr.read_limits.outbound = out_rd_limit; > + pkt->ioctl.in.hdr.cb_private_data_length = priv_data_length; > + pkt->ioctl.in.hdr.cb_private_data_offset = > + offsetof(union connector_connect_ioctl, in.priv_data); > + pkt->ioctl.in.hdr.qp_handle = qp_handle; > + > + memcpy(&pkt->ioctl.in.hdr.phys_addr, phys_addr, > + sizeof(struct if_physical_addr)); > + > + /* > + * Luke's code does not copy the ip address. 
> + */ > + memcpy(&pkt->ioctl.in.hdr.destination_address, dest_addr, > + sizeof(union nd_sockaddr_inet)); > + > + pkt->ioctl.in.retry_cnt = 7; > + pkt->ioctl.in.rnr_retry_cnt = 7; > + memcpy(pkt->ioctl.in.priv_data, priv_data, priv_data_length); > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, sizeof(*pkt), (u64)pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + enum ibv_qp_state *qp_state) > +{ > + struct pkt_nd_connector_connect_complete pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_COMPLETE_CONNECT, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.handle = connector_handle; > + pkt.ioctl.in.rnr_nak_to = 0; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + *qp_state = pkt.ioctl.out.state; > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, > + u64 qp_handle, > + u32 in_rd_limit, u32 out_rd_limit, > + u32 priv_data_length, const u8 *priv_data, > + enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep) > +{ > + struct pkt_nd_connector_accept pkt; > + int ret; > + union ndv_context_handle irp_fhandle; > + > + irp_fhandle.local = ep->local_irp; > + > + if (priv_data_length > MAX_PRIVATE_DATA_LEN) { > + hvnd_error("priv_data_length=%d\n", priv_data_length); > + return -EINVAL; > + } > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_ACCEPT, 0, 0, irp_fhandle.val64); > + > + /* > + * Now fill in the ioctl section. 
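One detail worth calling out in the connect/accept/reject trio: every path validates the caller-supplied private-data length against MAX_PRIVATE_DATA_LEN before the memcpy into the fixed-size packet field, which is what keeps a hostile length from overrunning the packet. The shape of the check, with an invented constant value:

#include <stdio.h>
#include <string.h>

#define MAX_PRIV_LEN 148	/* illustrative; not the driver's value */

struct accept_req {
	unsigned int priv_len;
	unsigned char priv_data[MAX_PRIV_LEN];
};

static int fill_priv(struct accept_req *req,
		     const unsigned char *data, unsigned int len)
{
	if (len > MAX_PRIV_LEN)		/* reject before copying */
		return -1;
	req->priv_len = len;
	memcpy(req->priv_data, data, len);
	return 0;
}

int main(void)
{
	struct accept_req req;
	unsigned char blob[200] = { 0 };

	printf("small: %d, oversized: %d\n",
	       fill_priv(&req, blob, 16), fill_priv(&req, blob, 200));
	return 0;
}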
> + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.reserved = 0; > + pkt.ioctl.in.hdr.read_limits.inbound = in_rd_limit; > + pkt.ioctl.in.hdr.read_limits.outbound = out_rd_limit; > + pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; > + > + pkt.ioctl.in.hdr.cb_private_data_offset = > + offsetof(struct connector_accept_in, private_data); > + > + pkt.ioctl.in.hdr.connector_handle = connector_handle; > + pkt.ioctl.in.hdr.qp_handle = qp_handle; > + > + pkt.ioctl.in.rnr_nak_to = 0; > + pkt.ioctl.in.rnr_retry_cnt = 7; > + > + > + memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + *qp_state = pkt.ioctl.out.state; > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, > + u32 priv_data_length, u8 *priv_data, > + enum ibv_qp_state *qp_state) > +{ > + struct pkt_nd_connector_reject pkt; > + int ret; > + > + if (priv_data_length > MAX_PRIVATE_DATA_LEN) { > + hvnd_error("priv_data_length=%d\n", priv_data_length); > + return -EINVAL; > + } > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_REJECT, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.hdr.version = ND_VERSION_1; > + pkt.ioctl.in.hdr.reserved = 0; > + pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; > + > + pkt.ioctl.in.hdr.cb_private_data_offset = > + offsetof(struct connector_reject_in, private_data); > + > + pkt.ioctl.in.hdr.connector_handle = connector_handle; > + > + memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + *qp_state = pkt.ioctl.out.state; > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + struct nd_read_limits *rd_limits) > +{ > + struct pkt_nd_connector_get_rd_limits pkt; > + int ret; > + > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_GET_READ_LIMITS, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. 
> + */ > + pkt.ioctl.in.in.version = ND_VERSION_1; > + pkt.ioctl.in.in.reserved = 0; > + pkt.ioctl.in.in.handle = connector_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + *rd_limits = pkt.ioctl.out.out; > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + u8 *priv_data) > +{ > + struct pkt_nd_connector_get_priv_data pkt; > + int ret; > + > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = connector_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + memcpy(priv_data, pkt.ioctl.out, MAX_PRIVATE_DATA_LEN); > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + union nd_sockaddr_inet *peer_addr) > +{ > + struct pkt_nd_connector_get_peer_addr pkt; > + int ret; > + > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = connector_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + memcpy(peer_addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); > + return 0; > + > +err: > + return ret; > +} > + > +int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + union nd_sockaddr_inet *addr) > +{ > + struct pkt_nd_connector_get_addr pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_GET_ADDRESS, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. 
> + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = connector_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); > + return 0; > + > +err: > + return ret; > +} > + > + > +int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, struct hvnd_ep_obj *ep) > +{ > + struct pkt_nd_connector_notify_disconnect pkt; > + int ret; > + union ndv_context_handle irp_fhandle; > + > + irp_fhandle.local = ep->local_irp; > + > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT, 0, 0, > + irp_fhandle.val64); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = connector_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + return ret; > +} > + > + > +//ASYNCH call > +int hvnd_connector_disconnect(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, struct hvnd_ep_obj *ep) > +{ > + struct pkt_nd_connector_disconnect pkt; > + int ret; > + union ndv_context_handle irp_fhandle; > + > + irp_fhandle.local = ep->local_irp; > + > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_CONNECTOR_DISCONNECT, 0, 0, irp_fhandle.val64); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = connector_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + return ret; > +} > + > +/* > + * QP operations. > + */ > +int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_qp *qp) > +{ > + int ret; > + struct pkt_nd_create_qp *pkt; > + int num_pfn, num_db_pfn; > + int qp_pkt_size; > + unsigned int offset; > + u32 ext_data_offset; > + u32 ext_data_size; > + > + /* > + * Now create QP. > + * First compute the number of PFNs we need to accommodate: > + * One for the doorbell, plus the pages in the qp buffer.
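The asynchronous connector calls above stuff the 32-bit local IRP handle into the 64-bit irp_handle field of the control header through union ndv_context_handle's val64 view. A sketch of that union trick; the field layout below is an assumption for illustration only, the real definition lives in vmbus_rdma.h:

#include <stdio.h>
#include <stdint.h>

/* Assumed layout, for illustration only. */
union ctx_handle {
	uint64_t val64;
	struct {
		uint32_t local;
		uint32_t remote;
	};
};

int main(void)
{
	union ctx_handle h = { 0 };

	h.local = 7;	/* per-guest IRP slot */
	printf("val64 = %#llx\n", (unsigned long long)h.val64);
	return 0;
}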
> + */ > + offset = offset_in_page(qp->qp_buf); > + num_pfn = DIV_ROUND_UP(offset + qp->buf_size, PAGE_SIZE); > + > + offset = offset_in_page(qp->db_addr); > + num_db_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); > + > + qp_pkt_size = sizeof(struct pkt_nd_create_qp) + > + (num_pfn * sizeof(u64)); > + > + hvnd_debug("CREATE QP, num pfns is %d\n", num_pfn); > + hvnd_debug("CREATE QP, num DB pfns is %d\n", num_db_pfn); > + > + pkt = kzalloc(qp_pkt_size, GFP_KERNEL); > + > + if (!pkt) > + return -ENOMEM; > + > + hvnd_debug("offset of nd_create_qp is %d\n", > + (int)offsetof(struct pkt_nd_create_qp, ioctl.input)); > + > + ext_data_offset = offsetof(struct pkt_nd_create_qp, ext_data) - > + sizeof(struct ndv_packet_hdr_control_1); > + > + ext_data_size = sizeof(struct create_qp_ext_data) + > + (num_pfn * sizeof(u64)); > + > + hvnd_init_hdr(&pkt->hdr, > + qp_pkt_size - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_QP_CREATE, > + ext_data_size, > + ext_data_offset, > + 0); > + > + /* > + * Now fill out the ioctl section. > + */ > + > + pkt->ioctl.input.hdr.version = ND_VERSION_1; > + > + if (qp->max_inline_data > nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold) > + qp->max_inline_data = nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold; > + pkt->ioctl.input.hdr.cb_max_inline_data = qp->max_inline_data; > + > + hvnd_debug("pkt->ioctl.input.hdr.cb_max_inline_data=%d\n", > + pkt->ioctl.input.hdr.cb_max_inline_data); > + > + pkt->ioctl.input.hdr.ce_mapping_cnt = MLX4_IB_CREATE_QP_MAPPINGS_MAX; > + pkt->ioctl.input.hdr.cb_mapping_offset = sizeof(union create_qp_ioctl); > + > + pkt->ioctl.input.hdr.initiator_queue_depth = qp->initiator_q_depth; > + pkt->ioctl.input.hdr.max_initiator_request_sge = > + qp->initiator_request_sge; > + > + hvnd_debug("recv cq handle is %p\n", (void *)qp->receive_cq_handle); > + hvnd_debug("send cq handle is %p\n", (void *)qp->initiator_cq_handle); > + hvnd_debug("pd handle is %p\n", (void *)qp->pd_handle); > + pkt->ioctl.input.hdr.receive_cq_handle = qp->receive_cq_handle; > + pkt->ioctl.input.hdr.initiator_cq_handle = qp->initiator_cq_handle; > + pkt->ioctl.input.hdr.pd_handle = qp->pd_handle; > + > + > + hvnd_debug("ce_mapping cnt is %d\n", > + pkt->ioctl.input.hdr.ce_mapping_cnt); > + hvnd_debug("cb_mapping offset is %d\n", > + pkt->ioctl.input.hdr.cb_mapping_offset); > + > + pkt->ioctl.input.receive_queue_depth = qp->receive_q_depth; > + pkt->ioctl.input.max_receive_request_sge = qp->receive_request_sge; > + > + > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.map_type = ND_MAP_MEMORY; > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.access_type = ND_MODIFY_ACCESS; > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.address = (u64)qp->qp_buf; > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.cb_length = qp->buf_size; > + > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.access_type = ND_WRITE_ACCESS; > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.address = (u64)qp->db_addr; > + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.cb_length = 4; > + > + pkt->mappings.qp_in.log_sq_bb_count = qp->log_sq_bb_count; > + pkt->mappings.qp_in.log_sq_stride = qp->log_sq_stride; > + pkt->mappings.qp_in.sq_no_prefetch = qp->sq_no_prefetch; > + > + > + /* > + * Fill in the 
extended data. > + */ > + > + pkt->ext_data.cnt = 2; > + pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].size = > + sizeof(struct gpa_range) + (num_pfn * sizeof(u64)); > + pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].offset = > + offsetof(struct create_qp_ext_data, qpbuf_gpa); > + > + pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].size = > + sizeof(struct qp_db_gpa); > + pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].offset = > + offsetof(struct create_qp_ext_data, db_gpa); > + > + /* > + * Fill up the gpa range for qp buffer. > + */ > + > + pkt->ext_data.db_gpa.byte_count = 4; // KYS 8 or 16? > + pkt->ext_data.db_gpa.byte_offset = offset_in_page(qp->db_addr); > + user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], qp->db_umem); > + > + pkt->ext_data.qpbuf_gpa.byte_count = qp->buf_size; > + pkt->ext_data.qpbuf_gpa.byte_offset = offset_in_page(qp->qp_buf); > + user_va_init_pfn(&pkt->ext_data.qpbuf_gpa.pfn_array[0], qp->umem); > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, qp_pkt_size, (u64)pkt); > + > + if (ret) > + goto cr_qp_err; > + > + /* > + * Copy the necessary response from the host. > + */ > + qp->qp_handle = pkt->ioctl.resrc_desc.handle; > + > + qp->qpn = pkt->mappings.qp_resp.qpn; > + qp->max_send_wr = pkt->mappings.qp_resp.max_send_wr; > + qp->max_recv_wr = pkt->mappings.qp_resp.max_recv_wr; > + qp->max_send_sge = pkt->mappings.qp_resp.max_send_sge; > + qp->max_recv_sge = pkt->mappings.qp_resp.max_recv_sge; > + > + > + hvnd_debug("qp->max_send_wr=%d max_recv_wr=%d max_send_sge=%d max_recv_sge=%d max_inline_data=%d\n", > + qp->max_send_wr, qp->max_recv_wr, qp->max_send_sge, > + qp->max_recv_sge, qp->max_inline_data); > + > + ret = insert_handle(nd_dev, &nd_dev->qpidr, qp, qp->qpn); > + > + if (ret) > + goto cr_qp_err; > + > + hvnd_debug("QP create after success qpn:%d qp:%p handle:%llu\n", > + qp->qpn, qp, qp->qp_handle); > + > +cr_qp_err: > + kfree(pkt); > + return ret; > +} > + > +int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_qp *qp) > +{ > + int ret; > + > + ret = hvnd_free_handle(nd_dev, uctx, qp->qp_handle, IOCTL_ND_QP_FREE); > + if (ret == 0) > + remove_handle(nd_dev, &nd_dev->qpidr, qp->qpn); > + return ret; > +} > + > +int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_qp *qp) > +{ > + struct pkt_nd_flush_qp pkt; > + int ret; > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + IOCTL_ND_QP_FLUSH, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = qp->qp_handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + return ret; > +} > + > + > +int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, > + char *mac_addr) > +{ > + int ret; > + int pkt_type = NDV_PKT_ID1_BIND; > + > + /* > + * Send the bind information over to the host. > + * For now, we will have a single ip and MAC address that we > + * will deal with. Down the road we will need to expand support > + * for multiple IP and MAC addresses and also deal with changing > + * IP addresses. 
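After the host confirms the QP, the qpn it assigned becomes the lookup key: insert_handle() files the QP object in nd_dev->qpidr exactly as the CQ path earlier filed CQs by cqn, so later events can be routed back to the right object. A toy registry keyed by an externally assigned id (a small hash table standing in for the kernel's idr):

#include <stdio.h>

#define QP_BUCKETS 256

struct qp_obj { int qpn; struct qp_obj *next; };
static struct qp_obj *buckets[QP_BUCKETS];

static void qp_insert(struct qp_obj *qp)
{
	struct qp_obj **b = &buckets[qp->qpn % QP_BUCKETS];

	qp->next = *b;
	*b = qp;
}

static struct qp_obj *qp_lookup(int qpn)
{
	for (struct qp_obj *q = buckets[qpn % QP_BUCKETS]; q; q = q->next)
		if (q->qpn == qpn)
			return q;
	return NULL;
}

int main(void)
{
	struct qp_obj qp = { .qpn = 4097 };

	qp_insert(&qp);
	printf("qpn 4097 -> %p\n", (void *)qp_lookup(4097));
	return 0;
}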
> + */ > + > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + hvnd_debug("bind packet type is %d ID:%d\n", > + pkt_type, NDV_PACKET_TYPE_ID(pkt_type)); > + nd_dev->bind_pkt.pkt_hdr.packet_type = pkt_type; > + > + nd_dev->bind_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_bind_1); > + hvnd_debug("bind packet size is %d\n", > + (int)sizeof(struct ndv_pkt_hdr_bind_1)); > + nd_dev->bind_pkt.pkt_hdr.data_sz = 0; > + nd_dev->bind_pkt.unbind = un_bind; > + nd_dev->bind_pkt.ip_address.address_family = AF_INET; > + nd_dev->bind_pkt.ip_address.ipv4.sin_family = AF_INET; > + nd_dev->bind_pkt.ip_address.ipv4.sin_port = 0; > + nd_dev->bind_pkt.ip_address.ipv4.sin_addr.s_addr = *(unsigned int *)ip_addr; > + > + nd_dev->bind_pkt.phys_addr.length = ETH_ALEN; > + memcpy(nd_dev->bind_pkt.phys_addr.addr, mac_addr, ETH_ALEN); > + > + /* > + * This is the adapter handle; needs to be unique for each > + * MAC, ip address tuple. > + */ > + nd_dev->bind_pkt.guest_id = (u64)nd_dev; > + > + ret = hvnd_send_packet(nd_dev, &nd_dev->bind_pkt, > + sizeof(struct ndv_pkt_hdr_bind_1), > + (u64)NULL, > + true); > + return ret; > +} > + > +int hvnd_init_resources(struct hvnd_dev *nd_dev) > +{ > + unsigned long mmio_sz; > + struct resource *resrc; > + int ret = -ENOMEM; > + > + resrc = &iomem_resource; > + > + mmio_sz = nd_dev->hvdev->channel->offermsg.offer.mmio_megabytes * > + 1024 * 1024; > + nd_dev->mmio_sz = mmio_sz; > + nd_dev->mmio_resource.name = KBUILD_MODNAME; > + nd_dev->mmio_resource.flags = IORESOURCE_MEM | IORESOURCE_BUSY; > + > + ret = allocate_resource(resrc, &nd_dev->mmio_resource, > + mmio_sz, 0, -1, mmio_sz, NULL, NULL); > + > + if (ret) { > + hvnd_error("Unable to allocate mmio resources\n"); > + return ret; > + } > + hvnd_debug("MMIO start is %p\n", (void *)nd_dev->mmio_resource.start); > + > + /* > + * Send the mmio information over to the host. > + */ > + nd_dev->resources.pkt_hdr.packet_type = NDV_PKT_ID1_INIT_RESOURCES; > + nd_dev->resources.pkt_hdr.hdr_sz = sizeof(union ndv_packet_hdr); > + nd_dev->resources.pkt_hdr.data_sz = 0; > + > + nd_dev->resources.io_space_sz_mb = mmio_sz; > + nd_dev->resources.io_space_start = nd_dev->mmio_resource.start; > + > + ret = hvnd_send_packet(nd_dev, &nd_dev->resources, > + sizeof(struct ndv_pkt_hdr_init_resources_1), > + (u64)NULL, > + true); > + return ret; > +} > + > +int hvnd_query_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > +{ > + struct pkt_nd_query_adaptor *pkt; > + int ret; > + int pkt_type; > + struct ib_device_attr *props = &nd_dev->ibdev.attrs; > + struct adapter_info_v2 *adap_info = &nd_dev->query_pkt.ioctl.ad_info; > + > + hvnd_debug("Performing Adapter query nd_dev=%p\n", nd_dev); > + > + // check if there is a need to do query > + if (nd_dev->query_pkt_set) > + return 0; > + > + // need a lock, multiple processes can call this at the same time > + > + down(&nd_dev->query_pkt_sem); > + if (nd_dev->query_pkt_set) { > + up(&nd_dev->query_pkt_sem); > + return 0; > + } > + > + /* > + * Now query the adaptor.
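The query_pkt_set dance that opens hvnd_query_adaptor is a double-checked one-time initialization: test the flag cheaply, take the semaphore, re-test under the lock, and only then do the host round trip. The same shape in userspace with a pthread mutex (the kernel semaphore provides the needed visibility; a from-scratch userspace version could also simply use pthread_once):

#include <stdio.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static bool cached;
static int cached_val;

static int slow_query(void) { return 42; }	/* stands in for the host ioctl */

static int query_once(void)
{
	if (cached)			/* fast path */
		return cached_val;
	pthread_mutex_lock(&lock);
	if (!cached) {			/* re-check under the lock */
		cached_val = slow_query();
		cached = true;
	}
	pthread_mutex_unlock(&lock);
	return cached_val;
}

int main(void)
{
	printf("%d %d\n", query_once(), query_once());
	return 0;	/* the slow query ran exactly once */
}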
> + */ > + > + pkt = &nd_dev->query_pkt; > + > + pkt_type = NDV_PKT_ID1_CONTROL; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + > + pkt->hdr.pkt_hdr.packet_type = pkt_type; > + pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > + pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_query_adaptor) - > + sizeof(struct ndv_packet_hdr_control_1); > + > + > + pkt->hdr.file_handle.local = uctx->file_handle.local; > + pkt->hdr.file_handle.remote = uctx->file_handle.remote; > + > + pkt->hdr.irp_handle.val64 = 0; > + > + pkt->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_QUERY; > + pkt->hdr.output_buf_sz = sizeof(struct nd_adap_query_ioctl); > + pkt->hdr.input_buf_sz = sizeof(struct nd_adap_query_ioctl); > + pkt->hdr.input_output_buf_offset = 0; > + memset(&pkt->ioctl.ad_q, 0, sizeof(struct nd_adap_query_ioctl)); > + > + pkt->ioctl.ad_q.version = ND_VERSION_1; > + pkt->ioctl.ad_q.info_version = ND_VERSION_2; > + pkt->ioctl.ad_q.adapter_handle = uctx->adaptor_hdl; > + > + ret = hvnd_send_packet(nd_dev, pkt, > + sizeof(struct pkt_nd_query_adaptor), > + (unsigned long)pkt, true); > + > + hvnd_debug("pkt->ioctl.ad_info.inline_request_threshold=%d\n", > + pkt->ioctl.ad_info.inline_request_threshold); > + nd_dev->query_pkt_set = true; > + > + // how about host returning PENDING > + up(&nd_dev->query_pkt_sem); > + > + if (ret) > + return ret; > + > + hvnd_debug("Query Adaptor Succeeded\n"); > + > + /* > + * Cache the relevant properties. > + */ > + props->fw_ver = 0; > + props->device_cap_flags = 0; > + props->vendor_id = 0x15b3; > + props->vendor_part_id = adap_info->device_id; > + props->max_mr_size = ~0ull; > + props->page_size_cap = PAGE_SIZE; > + props->max_qp = 16384; > + props->max_qp_wr = min(adap_info->max_recv_q_depth, > + adap_info->max_initiator_q_depth); > + props->max_sge = min(adap_info->max_initiator_sge, > + adap_info->max_recv_sge); > + props->max_cq = 0x1FFFF; > + props->max_cqe = adap_info->max_completion_q_depth; > + props->max_mr = 16384; > + props->max_pd = 16384; > + props->max_qp_rd_atom = adap_info->max_inbound_read_limit; > + props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; > + props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; > + props->max_srq = 16384; > + props->max_srq_wr = adap_info->max_recv_q_depth; > + props->max_srq_sge = adap_info->max_recv_sge; > + return 0; > +} > + > + > +int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev, > + struct hvnd_ib_pd *hvnd_pd) > +{ > + struct pkt_nd_pd_create *pkt = &uctx->pd_cr_pkt; > + int ret; > + int pkt_type; > + > + hvnd_debug("Create Protection Domain\n"); > + > + pkt_type = NDV_PKT_ID1_CONTROL; > + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > + > + pkt->hdr.pkt_hdr.packet_type = pkt_type; > + pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > + pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_pd_create) - > + sizeof(struct ndv_packet_hdr_control_1); > + > + hvnd_debug("pdcreate packet size: %d\n", > + (int)sizeof(struct pkt_nd_pd_create)); > + hvnd_debug("pdcreate hdr size: %d\n", > + (int)sizeof(struct ndv_packet_hdr_control_1)); > + hvnd_debug("pdcreate data size: %d\n", pkt->hdr.pkt_hdr.data_sz); > + > + pkt->hdr.file_handle.local = uctx->create_pkt.handle.local; > + pkt->hdr.file_handle.remote = uctx->create_pkt.handle.remote; > + > + hvnd_debug("create pd uctx is %p\n", uctx); > + hvnd_debug("create pd local file is %d\n", > + uctx->create_pkt.handle.local); > + hvnd_debug("create pd remote file
is %d\n", > + uctx->create_pkt.handle.remote); > + > + pkt->hdr.irp_handle.val64 = 0; > + pkt->hdr.io_cntrl_code = IOCTL_ND_PD_CREATE; > + > + pkt->hdr.output_buf_sz = sizeof(struct nd_create_pd_ioctl); > + pkt->hdr.input_buf_sz = sizeof(struct nd_create_pd_ioctl); > + pkt->hdr.input_output_buf_offset = 0; > + > + hvnd_debug("output/input buf size: %d\n", pkt->hdr.output_buf_sz); > + > + /* > + * Fill the ioctl section. > + */ > + pkt->ioctl.in.version = ND_VERSION_1; > + pkt->ioctl.in.reserved = 0; > + pkt->ioctl.in.handle = uctx->adaptor_hdl; > + > + > + ret = hvnd_send_packet(nd_dev, pkt, > + sizeof(struct pkt_nd_pd_create), > + (unsigned long)pkt, true); > + > + if (ret) > + return ret; > + > + if (pkt->hdr.pkt_hdr.status != 0) { > + hvnd_error("Create PD failed; status is %d\n", > + pkt->hdr.pkt_hdr.status); > + return -EINVAL; > + } > + if (pkt->hdr.io_status != 0) { > + hvnd_error("Create PD failed;io status is %d\n", > + pkt->hdr.io_status); > + return -EINVAL; > + } > + > + hvnd_debug("Create PD Succeeded\n"); > + > + hvnd_debug("pd_handle is %p\n", (void *)pkt->ioctl.resp.pd_handle); > + hvnd_debug("pdn is %d\n", (int)pkt->ioctl.resp.pdn); > + > + hvnd_pd->pdn = pkt->ioctl.resp.pdn; > + hvnd_pd->handle = pkt->ioctl.out_handle; > + > + return 0; > +} > + > +int hvnd_cancel_io(struct hvnd_ep_obj *ep_object) > +{ > + struct pkt_nd_cancel_io pkt; > + int ret; > + u32 ioctl; > + > + switch (ep_object->type) { > + case ND_LISTENER: > + hvnd_debug("LISTENER I/O Cancelled\n"); > + ioctl = IOCTL_ND_LISTENER_CANCEL_IO; > + break; > + case ND_CONNECTOR: > + hvnd_debug("CONNECTOR I/O Cancelled\n"); > + ioctl = IOCTL_ND_CONNECTOR_CANCEL_IO; > + break; > + case ND_MR: > + hvnd_debug("MR I/O Cancelled\n"); > + ioctl = IOCTL_ND_MR_CANCEL_IO; > + break; > + case ND_CQ: > + hvnd_debug("CQ I/O Cancelled\n"); > + ioctl = IOCTL_ND_CQ_CANCEL_IO; > + break; > + default: > + hvnd_error("UNKNOWN object type\n"); > + return -EINVAL; > + } > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + ep_object->uctx->create_pkt.handle.local, > + ep_object->uctx->create_pkt.handle.remote, > + ioctl, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. > + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = ep_object->ep_handle; > + hvnd_debug("cancel io handle is %p\n", (void *)ep_object->ep_handle); > + > + ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, &pkt.hdr, > + sizeof(pkt), > + (u64)&pkt); > + > + if (ret) > + goto err; > + > + /* > + * Now that we have cancelled all I/Os, > + */ > + > + return 0; > + > +err: > + hvnd_error("cancel I/O operation failed\n"); > + return ret; > +} > + > + > +int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle, u32 ioctl) > +{ > + struct pkt_nd_free_handle pkt; > + int ret; > + > + hvnd_debug("Freeing handle ioctl is %s; handle is %p\n", > + hvnd_get_op_name(ioctl), (void *)handle); > + > + hvnd_debug("uctx is %p\n", uctx); > + hvnd_debug("nd_dev is %p\n", nd_dev); > + > + // KYS try to avoid having to zero everything > + memset(&pkt, 0, sizeof(pkt)); > + hvnd_init_hdr(&pkt.hdr, > + sizeof(pkt) - > + sizeof(struct ndv_packet_hdr_control_1), > + uctx->create_pkt.handle.local, > + uctx->create_pkt.handle.remote, > + ioctl, 0, 0, 0); > + > + /* > + * Now fill in the ioctl section. 
> + */ > + pkt.ioctl.in.version = ND_VERSION_1; > + pkt.ioctl.in.reserved = 0; > + pkt.ioctl.in.handle = handle; > + > + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > + > + if (ret) > + goto err; > + > + return 0; > + > +err: > + hvnd_error("%s: ret=%d\n", __func__, ret); > + return ret; > +} > + > +int hvnd_negotiate_version(struct hvnd_dev *nd_dev) > +{ > + union ndv_packet_init *pkt = &nd_dev->init_pkt; > + int ret; > + > + nd_dev->negotiated_version = NDV_PROTOCOL_VAERSION_INVALID; > + > + pkt->packet_type = NDV_PACKET_TYPE_INIT; > + pkt->protocol_version = NDV_PROTOCOL_VERSION_CURRENT; > + pkt->flags = 0; // KYS are the flags 0? > + > + ret = hvnd_send_packet(nd_dev, pkt, > + sizeof(union ndv_packet_init), (u64)NULL, true); > + > + return ret; > +} > + > +void hvnd_callback(void *context) > +{ > + struct hv_device *dev = context; > + struct hvnd_dev *nd_dev = hv_get_drvdata(dev); > + int copy_sz = 0; > + struct ndv_packet_hdr_control_1 *ctrl_hdr; > + union ndv_packet_init *pkt_init; > + u32 recvlen; > + u32 local_irp; > + u64 requestid; > + u32 *pkt_type; > + u32 pkt_id; > + struct hvnd_ep_obj *ep_object; > + struct incoming_pkt *incoming_pkt; /* Used only for asynch calls */ > + char *incoming_pkt_start; > + struct vmpacket_descriptor *desc; > + int status; > + struct hvnd_cookie *hvnd_cookie; > + unsigned long flags; > + > + vmbus_recvpacket_raw(dev->channel, hvnd_recv_buffer, > + (PAGE_SIZE * 4), &recvlen, &requestid); > + > + if (recvlen <= 0) > + return; > + > + desc = (struct vmpacket_descriptor *)hvnd_recv_buffer; > + incoming_pkt_start = hvnd_recv_buffer + (desc->offset8 << 3); > + recvlen -= desc->offset8 << 3; > + > + pkt_type = (u32 *)incoming_pkt_start; > + pkt_id = *pkt_type; > + if (pkt_id != NDV_PACKET_TYPE_INIT) > + pkt_id = NDV_PACKET_TYPE_ID(pkt_id); > + > + switch (pkt_id) { > + case NDV_PACKET_TYPE_INIT: > + /* > + * Host is responding to our init packet. 
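In the callback above, the payload is located at hvnd_recv_buffer + (desc->offset8 << 3) because VMBus descriptors express the header length in 8-byte units. The arithmetic in isolation (only the relevant descriptor fields are modeled here, as an illustration, not the full struct vmpacket_descriptor):

#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Only the fields the offset math needs, for illustration. */
struct desc_demo {
	uint16_t type;
	uint16_t offset8;	/* payload offset, in 8-byte units */
	uint16_t len8;		/* total length, in 8-byte units */
	uint16_t flags;
};

int main(void)
{
	uint8_t ring[64] = { 0 };
	struct desc_demo d = { .offset8 = 2, .len8 = 8 };	/* 16-byte header */
	uint32_t payload_off = (uint32_t)d.offset8 << 3;

	memcpy(ring, &d, sizeof(d));
	printf("payload at +%u, %u payload bytes\n",
	       payload_off, ((uint32_t)d.len8 << 3) - payload_off);
	return 0;
}

This is also why recvlen is decremented by desc->offset8 << 3 immediately afterwards: from that point on it counts payload bytes only.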
> + */ > + pkt_init = (union ndv_packet_init *)incoming_pkt_start; > + nd_dev->negotiated_version = pkt_init->protocol_version; > + copy_sz = 0; > + break; > + > + case NDV_PKT_ID1_INIT_RESOURCES: > + copy_sz = 0; > + break; > + > + case NDV_PKT_ID1_BIND: > + nd_dev->bind_pkt.pkt_hdr.status = > + ((union ndv_packet_hdr *) incoming_pkt_start)->status; > + copy_sz = 0; > + break; > + > + case NDV_PKT_ID1_COMPLETE: > + ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; > + status = ctrl_hdr->io_status; > + > + local_irp = ctrl_hdr->irp_handle.local; > + ep_object = (struct hvnd_ep_obj *)map_irp_to_ctx(nd_dev, > + local_irp); > + > + if (!ep_object) { > + hvnd_error("irp could not be mapped; irp is %d ioctl is %s", > + local_irp, > + hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); > + goto complete; > + } > + > + if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) > + hvnd_debug("completion packet; iostatus is %x, ioctl is %s", > + ctrl_hdr->io_status, > + hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); > + > + switch (ctrl_hdr->io_cntrl_code) { > + > + case IOCTL_ND_CQ_NOTIFY: > + hvnd_process_cq_event_complete(ep_object, status); > + > + ep_del_work_pending(ep_object); > + goto complete; > + > + case IOCTL_ND_CONNECTOR_ACCEPT: > + > + hvnd_process_connector_accept(ep_object, status); > + > + ep_del_work_pending(ep_object); > + goto complete; > + > + case IOCTL_ND_CONNECTOR_DISCONNECT: > + hvnd_debug("disconnected: ep opj is %p; status: %d\n", > + ep_object, status); > + hvnd_process_disconnect(ep_object, status); > + > + ep_del_work_pending(ep_object); > + goto complete; > + > + default: > + break; > + } > + > + /* > + * This is the completion notification; > + * the IRP cookie is the state through which > + * we will invoke the callback. > + */ > + incoming_pkt = kmalloc(recvlen + sizeof(struct incoming_pkt), > + GFP_ATOMIC); > + if (incoming_pkt == NULL) { > + hvnd_error("Could not alloc memory in callback\n"); > + ep_del_work_pending(ep_object); > + goto complete; > + } > + memcpy(incoming_pkt->pkt, incoming_pkt_start, recvlen); > + > + spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); > + list_add_tail(&incoming_pkt->list_entry, > + &ep_object->incoming_pkt_list); > + spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, > + flags); > + > + schedule_work(&ep_object->wrk.work); > + > + goto complete; > + > + case NDV_PKT_ID1_CREATE: > + copy_sz = sizeof(struct ndv_pkt_hdr_create_1); > + break; > + > + case NDV_PKT_ID1_CLEANUP: > + copy_sz = sizeof(struct ndv_pkt_hdr_cleanup_1); > + break; > + > + case NDV_PKT_ID1_CONTROL: > + ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; > + status = ctrl_hdr->io_status; > + > + if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) > + hvnd_debug("packet; iostatus is %x ioctl is %s", > + ctrl_hdr->io_status, hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); > + > + switch (ctrl_hdr->io_cntrl_code) { > + > + case IOCTL_ND_PROVIDER_INIT: > + copy_sz = sizeof(struct pkt_nd_provider_ioctl); > + break; > + > + case IOCTL_ND_PROVIDER_BIND_FILE: > + copy_sz = sizeof(struct pkt_nd_provider_ioctl); > + break; > + > + case IOCTL_ND_ADAPTER_OPEN: > + copy_sz = sizeof(struct pkt_nd_open_adapter); > + break; > + > + case IOCTL_ND_ADAPTER_CLOSE: > + copy_sz = sizeof(struct pkt_nd_free_handle); > + break; > + > + case IOCTL_ND_ADAPTER_QUERY: > + copy_sz = sizeof(struct pkt_nd_query_adaptor); > + break; > + > + case IOCTL_ND_PD_CREATE: > + copy_sz = sizeof(struct pkt_nd_pd_create); > + break; > + > + case IOCTL_ND_PD_FREE: > + copy_sz = 
sizeof(struct pkt_nd_free_handle); > + break; > + > + case IOCTL_ND_CQ_CREATE: > + copy_sz = sizeof(struct pkt_nd_create_cq); > + break; > + > + case IOCTL_ND_CQ_FREE: > + copy_sz = sizeof(struct pkt_nd_free_cq); > + break; > + > + case IOCTL_ND_CQ_NOTIFY: //FIXME check ep stop state > + local_irp = ctrl_hdr->irp_handle.local; > + ep_object = (struct hvnd_ep_obj *) > + map_irp_to_ctx(nd_dev, local_irp); > + if (!ep_object) { > + hvnd_error("irp could not be mapped\n"); > + goto complete; > + } > + copy_sz = sizeof(struct pkt_nd_notify_cq); > + hvnd_process_cq_event_pending(ep_object, status); > + goto complete; > + > + case IOCTL_ND_LISTENER_CREATE: > + copy_sz = sizeof(struct pkt_nd_cr_listener); > + break; > + > + case IOCTL_ND_LISTENER_FREE: > + copy_sz = sizeof(struct pkt_nd_free_listener); > + break; > + > + case IOCTL_ND_QP_FREE: > + copy_sz = sizeof(struct pkt_nd_free_handle); > + break; > + > + case IOCTL_ND_CONNECTOR_CANCEL_IO: > + case IOCTL_ND_MR_CANCEL_IO: > + case IOCTL_ND_CQ_CANCEL_IO: > + case IOCTL_ND_LISTENER_CANCEL_IO: > + copy_sz = sizeof(struct pkt_nd_cancel_io); > + break; > + > + case IOCTL_ND_LISTENER_BIND: > + copy_sz = sizeof(struct pkt_nd_bind_listener); > + break; > + > + case IOCTL_ND_LISTENER_LISTEN: > + copy_sz = sizeof(struct pkt_nd_listen_listener); > + break; > + > + case IOCTL_ND_LISTENER_GET_ADDRESS: > + copy_sz = sizeof(struct pkt_nd_get_addr_listener); > + break; > + > + case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: > + copy_sz = sizeof(struct pkt_nd_get_connection_listener); > + goto complete; // non-block > + > + case IOCTL_ND_CONNECTOR_CREATE: > + copy_sz = sizeof(struct pkt_nd_cr_connector); > + break; > + > + case IOCTL_ND_CONNECTOR_FREE: > + copy_sz = sizeof(struct pkt_nd_free_connector); > + break; > + > + case IOCTL_ND_CONNECTOR_BIND: > + copy_sz = sizeof(struct pkt_nd_free_connector); > + break; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_CONNECT: > + copy_sz = sizeof(struct pkt_nd_connector_connect); > + goto complete; //non-block > + > + case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: > + copy_sz = sizeof(struct pkt_nd_connector_connect_complete); > + break; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_ACCEPT: > + copy_sz = sizeof(struct pkt_nd_connector_accept); > + goto complete; //non-block > + > + case IOCTL_ND_CONNECTOR_REJECT: > + copy_sz = sizeof(struct pkt_nd_connector_reject); > + break; > + > + case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: > + copy_sz = sizeof(struct pkt_nd_connector_get_rd_limits); > + break; > + > + case IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: > + copy_sz = sizeof(struct pkt_nd_connector_get_priv_data); > + break; > + > + case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: > + copy_sz = sizeof(struct pkt_nd_connector_get_peer_addr); > + break; > + > + case IOCTL_ND_CONNECTOR_GET_ADDRESS: > + copy_sz = sizeof(struct pkt_nd_connector_get_addr); > + break; > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: > + copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); > + goto complete; //non-block > + > + // KYS: ALERT: ASYNCH Operation > + case IOCTL_ND_CONNECTOR_DISCONNECT: > + hvnd_debug("IOCTL_ND_CONNECTOR_DISCONNECT\n"); > + copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); > + goto complete; // non-block > + > + case IOCTL_ND_QP_CREATE: > + copy_sz = sizeof(struct pkt_nd_create_qp); > + break; > + > + case IOCTL_ND_MR_CREATE: > + copy_sz = sizeof(struct pkt_nd_create_mr); > + break; > + > + case IOCTL_ND_MR_FREE: > 
+ copy_sz = sizeof(struct pkt_nd_free_handle); > + break; > + > + case IOCTL_ND_MR_REGISTER: > + copy_sz = sizeof(struct pkt_nd_register_mr); > + break; > + > + case IOCTL_ND_MR_DEREGISTER: > + copy_sz = sizeof(struct pkt_nd_deregister_mr); > + break; > + > + case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: > + copy_sz = sizeof(struct pkt_query_addr_list); > + break; > + > + case IOCTL_ND_QP_FLUSH: > + copy_sz = sizeof(struct pkt_nd_flush_qp); > + break; > + > + default: > + hvnd_warn("Got unknown ioctl: %d\n", > + ctrl_hdr->io_cntrl_code); > + copy_sz = 0; > + break; > + } > + > + break; > + default: > + hvnd_warn("Got an unknown packet type %d\n", *pkt_type); > + break; > + } > + > + hvnd_cookie = (struct hvnd_cookie *)requestid; > + memcpy(hvnd_cookie->pkt, incoming_pkt_start, copy_sz); > + complete(&hvnd_cookie->host_event); > + > +complete: > + /* send out ioctl completion packet */ > + if (desc->flags & VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED) { > + int retry = 5; > + > + while (true) { > + int ret; > + > + ret = vmbus_sendpacket(dev->channel, NULL, 0, > + requestid, VM_PKT_COMP, 0); > + if (ret == 0) { > + break; > + } else if (ret == -EAGAIN) { > + if (--retry == 0) { > + hvnd_error("give up retrying send completion packet\n"); > + break; > + } > + hvnd_warn("retrying send completion packet\n"); > + udelay(100); > + } else { > + hvnd_error("unable to send completion packet ret=%d\n", > + ret); > + break; > + } > + } > + } > + > +} > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.h > new file mode 100644 > index 000000000000..78e534fa8748 > --- /dev/null > +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.h > @@ -0,0 +1,2202 @@ > +/* > + * Copyright (c) 2014, Microsoft Corporation. > + * > + * Author: > + * K. Y. Srinivasan <kys@microsoft.com> > + * > + * This program is free software; you can redistribute it and/or modify it > + * under the terms of the GNU General Public License version 2 as published > + * by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, but > + * WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or > + * NON INFRINGEMENT. See the GNU General Public License for more > + * details. > + * > + * Bug fixes/enhancements: Long Li <longli@microsoft.com> > + */ > + > + > +#ifndef _VMBUS_RDMA_H > +#define _VMBUS_RDMA_H > + > + > +#include <linux/in.h> > +#include <linux/in6.h> > +#include <rdma/ib_verbs.h> > +#include <linux/idr.h> > +#include <linux/if_ether.h> > + > +/* NetworkDirect version Numbers. 
> + */ > +#define ND_VERSION_1 0x1 > +#define ND_VERSION_2 0x20000 > + > +#ifndef NDVER > +#define NDVER ND_VERSION_2 > +#endif > + > +#define ND_ADAPTER_FLAG_IN_ORDER_DMA_SUPPORTED 0x00000001 > +#define ND_ADAPTER_FLAG_CQ_INTERRUPT_MODERATION_SUPPORTED 0x00000004 > +#define ND_ADAPTER_FLAG_MULTI_ENGINE_SUPPORTED 0x00000008 > +#define ND_ADAPTER_FLAG_CQ_RESIZE_SUPPORTED 0x00000100 > +#define ND_ADAPTER_FLAG_LOOPBACK_CONNECTIONS_SUPPORTED 0x00010000 > + > +#define ND_CQ_NOTIFY_ERRORS 0 > +#define ND_CQ_NOTIFY_ANY 1 > +#define ND_CQ_NOTIFY_SOLICITED 2 > + > +#define ND_MR_FLAG_ALLOW_LOCAL_WRITE 0x00000001 > +#define ND_MR_FLAG_ALLOW_REMOTE_READ 0x00000002 > +#define ND_MR_FLAG_ALLOW_REMOTE_WRITE 0x00000005 > +#define ND_MR_FLAG_RDMA_READ_SINK 0x00000008 > +#define ND_MR_FLAG_DO_NOT_SECURE_VM 0x80000000 > + > +#define ND_OP_FLAG_SILENT_SUCCESS 0x00000001 > +#define ND_OP_FLAG_READ_FENCE 0x00000002 > +#define ND_OP_FLAG_SEND_AND_SOLICIT_EVENT 0x00000004 > +#define ND_OP_FLAG_ALLOW_READ 0x00000008 > +#define ND_OP_FLAG_ALLOW_WRITE 0x00000010 > + > +#if NDVER >= ND_VERSION_2 > +#define ND_OP_FLAG_INLINE 0x00000020 > +#endif > + > +#define ND_AF_INET6 23 > +#define IF_MAX_ADDR_LENGTH 32 > + > +struct group_affinity { > + u64 mask; //KYS: usually 0 > + u16 group; // KYS usually -1 > + u16 reserved[3]; > +}; > + > +struct if_physical_addr { > + u16 length; > + u8 addr[IF_MAX_ADDR_LENGTH]; > +}; > + > +struct adapter_info_v2 { > + u32 info_version; > + u16 vendor_id; > + u16 device_id; > + u64 adapter_id; > + size_t max_registration_size; > + size_t max_window_size; > + u32 max_initiator_sge; > + u32 max_recv_sge; > + u32 max_read_sge; > + u32 max_transfer_length; > + u32 max_inline_data_size; > + u32 max_inbound_read_limit; > + u32 max_outbound_read_limit; > + u32 max_recv_q_depth; > + u32 max_initiator_q_depth; > + u32 max_shared_recv_q_depth; > + u32 max_completion_q_depth; > + u32 inline_request_threshold; > + u32 large_request_threshold; > + u32 max_caller_data; > + u32 max_callee_data; > + u32 adapter_flags; > +} __packed; > + > +struct nd2_adapter_info_32 { //KYS: Check what this is > + u32 info_version; > + u16 vendor_id; > + u16 devic_id; > + u64 adapter_id; > + u32 max_registration_size; > + u32 max_window_size; > + u32 max_initiator_sge; > + u32 max_recv_sge; > + u32 max_read_sge; > + u32 max_transfer_length; > + u32 max_inline_data_size; > + u32 max_inbound_read_limit; > + u32 max_outbound_read_limit; > + u32 max_recv_q_depth; > + u32 max_initiator_q_depth; > + u32 max_shared_recv_q_depth; > + u32 max_completion_q_depth; > + u32 inline_request_threshold; > + u32 large_request_threshold; > + u32 max_caller_data; > + u32 max_callee_data; > + u32 adapter_flags; > +} __packed; > + > +enum nd2_request_type { > + ND2_RT_RECEIVE, > + ND2_RT_SEND, > + ND2_RT_BIND, > + ND2_RT_INVALIDATE, > + ND2_RT_READ, > + ND2_RT_WRITE > +}; > + > +struct nd2_result { > + u32 status; > + u32 bytes_transferred; > + void *qp_ctx; > + void *request_ctx; > + enum nd2_request_type request_type; > +} __packed; > + > +struct nd2_sge { > + void *buffer; > + u32 buffer_length; > + u32 mr_token; > +} __packed; > + > +/* > + * The communication with the host via ioctls using VMBUS > + * as the transport. 
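> + *
> + * Each ioctl below travels as an NDV_PKT_ID1_CONTROL packet: a
> + * struct ndv_packet_hdr_control_1 header carrying the ioctl code,
> + * buffer sizes and the IRP handle, followed by the ioctl-specific
> + * pkt_nd_* payload defined in this section (see the completion
> + * handler in vmbus_rdma.c, which selects the copy size per ioctl).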
> + */ > + > +#define ND_IOCTL_VERSION 1 > + > +enum nd_mapping_type { > + ND_MAP_IOSPACE, > + ND_MAP_MEMORY, > + ND_MAP_MEMORY_COALLESCE, > + ND_MAP_PAGES, > + ND_MAP_PAGES_COALLESCE, > + ND_UNMAP_IOSPACE, > + ND_UNMAP_MEMORY, > + ND_MAX_MAP_TYPE > +}; > + > +enum nd_caching_type { > + ND_NON_CACHED = 0, > + ND_CACHED, > + ND_WRITE_COMBINED, > + ND_MAX_CACHE_TYPE > +}; > + > +enum nd_aceess_type { > + ND_READ_ACCESS = 0, > + ND_WRITE_ACCESS, > + ND_MODIFY_ACCESS > +}; > + > +struct nd_map_io_space { > + enum nd_mapping_type map_type; > + enum nd_caching_type cache_type; > + u32 cb_length; > +}; > + > +struct nd_map_memory { > + enum nd_mapping_type map_type; > + enum nd_aceess_type access_type; > + u64 address; > + u32 cb_length; > +}; > + > +struct nd_mapping_id { > + enum nd_mapping_type map_type; > + u64 id; > +}; > + > +struct ndk_map_pages { > + struct nd_map_memory header; > + u32 page_offset; > +}; > + > +union nd_mapping { > + enum nd_mapping_type map_type; > + struct nd_map_io_space map_io_space; > + struct nd_map_memory map_memory; > + struct nd_mapping_id mapping_id; > + struct ndk_map_pages map_pages; > +}; > + > +struct nd_mapping_result { > + u64 id; > + u64 info; > +}; > + > +struct nd_resource_descriptor { > + u64 handle; > + u32 ce_mapping_results; > + u32 cb_mapping_results_offset; > +}; > + > +struct nd_handle { > + u32 version; > + u32 reserved; > + u64 handle; > +}; > + > +union nd_sockaddr_inet { > + struct sockaddr_in ipv4; > + struct sockaddr_in6 ipv6; > + u16 address_family; //KYS how is this supposed to work? > +}; > + > +struct nd_address_element { > + union nd_sockaddr_inet addr; > + char mac_addr[ETH_ALEN]; > +}; > + > +struct nd_resolve_address { > + u32 version; > + u32 reserved; > + union nd_sockaddr_inet address; > +}; > + > +struct nd_open_adapter { > + u32 version; > + u32 reserved; > + u32 ce_mapping_cnt; > + u32 cb_mapping_offset; > + u64 adapter_id; > +}; > + > +struct nd_adapter_query { > + u32 version; > + u32 info_version; > + u64 adapter_handle; > +}; > + > +struct nd_create_cq { > + u32 version; > + u32 queue_depth; > + u32 ce_mapping_cnt; > + u32 cb_mapping_offset; > + u64 adapter_handle; > + struct group_affinity affinity; > +}; > + > +struct nd_create_srq { > + u32 version; > + u32 queue_depth; > + u32 ce_mapping_cnt; > + u32 cb_mapping_offset; > + u32 max_request_sge; > + u32 notify_threshold; > + u64 pd_handle; > + struct group_affinity affinity; > +}; > + > +struct nd_create_qp_hdr { > + u32 version; > + u32 cb_max_inline_data; > + u32 ce_mapping_cnt; > + u32 cb_mapping_offset; //KYS: what is this prefix - ce/cb > + u32 initiator_queue_depth; > + u32 max_initiator_request_sge; > + u64 receive_cq_handle; > + u64 initiator_cq_handle; > + u64 pd_handle; > +}; > + > +struct nd_create_qp { > + struct nd_create_qp_hdr hdr; > + u32 receive_queue_depth; > + u32 max_receive_request_sge; > +}; > + > +struct nd_create_qp_with_srq { > + struct nd_create_qp_hdr header; > + u64 srq_handle; > +}; > + > +struct nd_srq_modify { > + u32 version; > + u32 queue_depth; > + u32 ce_mapping_cnt; > + u32 cb_mapping_offset; > + u32 notify_threshold; > + u32 reserved; > + u64 srq_handle; > +}; > + > +struct nd_cq_modify { > + u32 version; > + u32 queue_depth; > + u32 ce_mapping_count; > + u32 cb_mappings_offset; > + u64 cq_handle; > +}; > + > +struct nd_cq_notify { > + u32 version; > + u32 type; > + u64 cq_handle; > +}; > + > +struct nd_mr_register_hdr { > + u32 version; > + u32 flags; > + u64 cb_length; > + u64 target_addr; > + u64 mr_handle; > +}; > + > +struct 
nd_mr_register { > + struct nd_mr_register_hdr header; > + u64 address; > +}; > + > +struct nd_bind { > + u32 version; > + u32 reserved; > + u64 handle; > + union nd_sockaddr_inet address; > +}; > + > +struct nd_read_limits { > + u32 inbound; > + u32 outbound; > +}; > + > +struct nd_connect { > + u32 version; > + u32 reserved; > + struct nd_read_limits read_limits; > + u32 cb_private_data_length; > + u32 cb_private_data_offset; > + u64 connector_handle; > + u64 qp_handle; > + union nd_sockaddr_inet destination_address; > + struct if_physical_addr phys_addr; > +}; > + > +struct nd_accept { > + u32 version; > + u32 reserved; > + struct nd_read_limits read_limits; > + u32 cb_private_data_length; > + u32 cb_private_data_offset; > + u64 connector_handle; > + u64 qp_handle; > +}; > + > +struct nd_reject { > + u32 version; > + u32 reserved; > + u32 cb_private_data_length; > + u32 cb_private_data_offset; > + u64 connector_handle; > +}; > + > +struct nd_listen { > + u32 version; > + u32 back_log; > + u64 listener_handle; > +}; > + > +struct nd_get_connection_request { > + u32 version; > + u32 reserved; > + u64 listener_handle; > + u64 connector_handle; > +}; > + > +enum ndv_mmio_type { > + ND_PARTITION_KERNEL_VIRTUAL, > + ND_PARTITION_SYSTEM_PHYSICAL, > + ND_PARTITION_GUEST_PHYSICAL, > + ND_MAXIMUM_MMIO_TYPE > +}; > + > +struct ndv_resolve_adapter_id { > + u32 version; > + struct if_physical_addr phys_addr; > +}; > + > +struct ndv_partition_create { > + u32 version; > + enum ndv_mmio_type mmio_type; > + u64 adapter_id; > + u64 xmit_cap; > +}; > + > +struct ndv_partition_bind_luid { > + u32 version; > + u32 reserved; > + u64 partition_handle; > + struct if_physical_addr phys_addr; > + //IF_LUID luid; //KYS? > +}; > + > +struct ndv_partition_bind_address { > + u32 version; > + u32 reserved; > + u64 partition_handle; > + union nd_sockaddr_inet address; > + struct if_physical_addr guest_phys_addr; > + struct if_physical_addr phys_addr; > +}; > + > +struct ndk_mr_register { > + struct nd_mr_register_hdr hdr; > + u32 cb_logical_page_addresses_offset; > +}; > + > +struct ndk_bind { > + struct nd_bind hdr; > + u64 authentication_id; > + bool is_admin; > +}; > + > +#define FDN 0x12 > +#define METHOD_BUFFERED 0x0 > +#define FAA 0x0 > + > +#define CTL_CODE( DeviceType, Function, Method, Access ) ( \ > + ((DeviceType) << 16) | ((Access) << 14) | ((Function) << 2) | (Method) \ > +) > + > +#define ND_FUNCTION(r_, i_) ((r_) << 6 | (i_)) > +#define IOCTL_ND(r_, i_) \ > + CTL_CODE( FDN, ND_FUNCTION((r_), (i_)), METHOD_BUFFERED, FAA ) > + > +#define ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) ((ctrlCode_ >> 2) & 0xFFF) > +#define ND_RESOURCE_FROM_CTL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) >> 6) > +#define ND_OPERATION_FROM_CTRL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) & 0x3F) > + > +#define ND_DOS_DEVICE_NAME L"\\DosDevices\\Global\\NetworkDirect" > +#define ND_WIN32_DEVICE_NAME L"\\\\.\\NetworkDirect" > + > +enum nd_resource_type { > + ND_PROVIDER = 0, > + ND_ADAPTER, > + ND_PD, > + ND_CQ, > + ND_MR, > + ND_MW, > + ND_SRQ, > + ND_CONNECTOR, > + ND_LISTENER, > + ND_QP, > + ND_VIRTUAL_PARTITION, > + ND_RESOURCE_TYPE_COUNT > +}; > + > +#define ND_OPERATION_COUNT 14 > + > +#define IOCTL_ND_PROVIDER(i_) IOCTL_ND(ND_PROVIDER, i_) > +#define IOCTL_ND_ADAPTER(i_) IOCTL_ND(ND_ADAPTER, i_) > +#define IOCTL_ND_PD(i_) IOCTL_ND(ND_PD, i_) > +#define IOCTL_ND_CQ(i_) IOCTL_ND(ND_CQ, i_) > +#define IOCTL_ND_MR(i_) IOCTL_ND(ND_MR, i_) > +#define IOCTL_ND_MW(i_) IOCTL_ND(ND_MW, i_) > +#define IOCTL_ND_SRQ(i_) 
IOCTL_ND(ND_SRQ, i_) > +#define IOCTL_ND_CONNECTOR(i_) IOCTL_ND(ND_CONNECTOR, i_) > +#define IOCTL_ND_LISTENER(i_) IOCTL_ND(ND_LISTENER, i_) > +#define IOCTL_ND_QP(i_) IOCTL_ND(ND_QP, i_) > +#define IOCTL_ND_VIRTUAL_PARTITION(i_) IOCTL_ND(ND_VIRTUAL_PARTITION, i_) > + > +/* Provider IOCTLs */ > +#define IOCTL_ND_PROVIDER_INIT IOCTL_ND_PROVIDER( 0 ) > +#define IOCTL_ND_PROVIDER_BIND_FILE IOCTL_ND_PROVIDER( 1 ) > +#define IOCTL_ND_PROVIDER_QUERY_ADDRESS_LIST IOCTL_ND_PROVIDER( 2 ) > +#define IOCTL_ND_PROVIDER_RESOLVE_ADDRESS IOCTL_ND_PROVIDER( 3 ) > +#define IOCTL_ND_PROVIDER_MAX_OPERATION 4 > + > +/* Adapter IOCTLs */ > +#define IOCTL_ND_ADAPTER_OPEN IOCTL_ND_ADAPTER( 0 ) > +#define IOCTL_ND_ADAPTER_CLOSE IOCTL_ND_ADAPTER( 1 ) > +#define IOCTL_ND_ADAPTER_QUERY IOCTL_ND_ADAPTER( 2 ) > +#define IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST IOCTL_ND_ADAPTER( 3 ) > +#define IOCTL_ND_ADAPTER_MAX_OPERATION 4 > + > +/* Protection Domain IOCTLs */ > +#define IOCTL_ND_PD_CREATE IOCTL_ND_PD( 0 ) > +#define IOCTL_ND_PD_FREE IOCTL_ND_PD( 1 ) > +#define IOCTL_ND_PD_MAX_OPERATION 2 > + > +/* Completion Queue IOCTLs */ > +#define IOCTL_ND_CQ_CREATE IOCTL_ND_CQ( 0 ) > +#define IOCTL_ND_CQ_FREE IOCTL_ND_CQ( 1 ) > +#define IOCTL_ND_CQ_CANCEL_IO IOCTL_ND_CQ( 2 ) > +#define IOCTL_ND_CQ_GET_AFFINITY IOCTL_ND_CQ( 3 ) > +#define IOCTL_ND_CQ_MODIFY IOCTL_ND_CQ( 4 ) > +#define IOCTL_ND_CQ_NOTIFY IOCTL_ND_CQ( 5 ) > +#define IOCTL_ND_CQ_MAX_OPERATION 6 > + > +/* Memory Region IOCTLs */ > +#define IOCTL_ND_MR_CREATE IOCTL_ND_MR( 0 ) > +#define IOCTL_ND_MR_FREE IOCTL_ND_MR( 1 ) > +#define IOCTL_ND_MR_CANCEL_IO IOCTL_ND_MR( 2 ) > +#define IOCTL_ND_MR_REGISTER IOCTL_ND_MR( 3 ) > +#define IOCTL_ND_MR_DEREGISTER IOCTL_ND_MR( 4 ) > +#define IOCTL_NDK_MR_REGISTER IOCTL_ND_MR( 5 ) > +#define IOCTL_ND_MR_MAX_OPERATION 6 > + > +/* Memory Window IOCTLs */ > +#define IOCTL_ND_MW_CREATE IOCTL_ND_MW( 0 ) > +#define IOCTL_ND_MW_FREE IOCTL_ND_MW( 1 ) > +#define IOCTL_ND_MW_MAX_OPERATION 2 > + > +/* Shared Receive Queue IOCTLs */ > +#define IOCTL_ND_SRQ_CREATE IOCTL_ND_SRQ( 0 ) > +#define IOCTL_ND_SRQ_FREE IOCTL_ND_SRQ( 1 ) > +#define IOCTL_ND_SRQ_CANCEL_IO IOCTL_ND_SRQ( 2 ) > +#define IOCTL_ND_SRQ_GET_AFFINITY IOCTL_ND_SRQ( 3 ) > +#define IOCTL_ND_SRQ_MODIFY IOCTL_ND_SRQ( 4 ) > +#define IOCTL_ND_SRQ_NOTIFY IOCTL_ND_SRQ( 5 ) > +#define IOCTL_ND_SRQ_MAX_OPERATION 6 > + > +/* Connector IOCTLs */ > +#define IOCTL_ND_CONNECTOR_CREATE IOCTL_ND_CONNECTOR( 0 ) > +#define IOCTL_ND_CONNECTOR_FREE IOCTL_ND_CONNECTOR( 1 ) > +#define IOCTL_ND_CONNECTOR_CANCEL_IO IOCTL_ND_CONNECTOR( 2 ) > +#define IOCTL_ND_CONNECTOR_BIND IOCTL_ND_CONNECTOR( 3 ) > +#define IOCTL_ND_CONNECTOR_CONNECT IOCTL_ND_CONNECTOR( 4 ) > +#define IOCTL_ND_CONNECTOR_COMPLETE_CONNECT IOCTL_ND_CONNECTOR( 5 ) > +#define IOCTL_ND_CONNECTOR_ACCEPT IOCTL_ND_CONNECTOR( 6 ) > +#define IOCTL_ND_CONNECTOR_REJECT IOCTL_ND_CONNECTOR( 7 ) > +#define IOCTL_ND_CONNECTOR_GET_READ_LIMITS IOCTL_ND_CONNECTOR( 8 ) > +#define IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA IOCTL_ND_CONNECTOR( 9 ) > +#define IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS IOCTL_ND_CONNECTOR( 10 ) > +#define IOCTL_ND_CONNECTOR_GET_ADDRESS IOCTL_ND_CONNECTOR( 11 ) > +#define IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT IOCTL_ND_CONNECTOR( 12 ) > +#define IOCTL_ND_CONNECTOR_DISCONNECT IOCTL_ND_CONNECTOR( 13 ) > +#define IOCTL_ND_CONNECTOR_MAX_OPERATION 14 > + > +/* Listener IOCTLs */ > +#define IOCTL_ND_LISTENER_CREATE IOCTL_ND_LISTENER( 0 ) > +#define IOCTL_ND_LISTENER_FREE IOCTL_ND_LISTENER( 1 ) > +#define IOCTL_ND_LISTENER_CANCEL_IO IOCTL_ND_LISTENER( 2 ) > 
+#define IOCTL_ND_LISTENER_BIND IOCTL_ND_LISTENER( 3 ) > +#define IOCTL_ND_LISTENER_LISTEN IOCTL_ND_LISTENER( 4 ) > +#define IOCTL_ND_LISTENER_GET_ADDRESS IOCTL_ND_LISTENER( 5 ) > +#define IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST IOCTL_ND_LISTENER( 6 ) > +#define IOCTL_ND_LISTENER_MAX_OPERATION 7 > + > +/* Queue Pair IOCTLs */ > +#define IOCTL_ND_QP_CREATE IOCTL_ND_QP( 0 ) > +#define IOCTL_ND_QP_CREATE_WITH_SRQ IOCTL_ND_QP( 1 ) > +#define IOCTL_ND_QP_FREE IOCTL_ND_QP( 2 ) > +#define IOCTL_ND_QP_FLUSH IOCTL_ND_QP( 3 ) > +#define IOCTL_ND_QP_MAX_OPERATION 4 > + > +/* Kernel-mode only IOCTLs (IRP_MJ_INTERNAL_DEVICE_CONTROL) */ > +#define IOCTL_NDV_PARTITION_RESOLVE_ADAPTER_ID IOCTL_ND_VIRTUAL_PARTITION( 0 ) > +#define IOCTL_NDV_PARTITION_CREATE IOCTL_ND_VIRTUAL_PARTITION( 1 ) > +#define IOCTL_NDV_PARTITION_FREE IOCTL_ND_VIRTUAL_PARTITION( 2 ) > +#define IOCTL_NDV_PARTITION_BIND IOCTL_ND_VIRTUAL_PARTITION( 3 ) > +#define IOCTL_NDV_PARTITION_UNBIND IOCTL_ND_VIRTUAL_PARTITION( 4 ) > +#define IOCTL_NDV_PARTITION_BIND_LUID IOCTL_ND_VIRTUAL_PARTITION( 5 ) > +#define IOCTL_NDV_PARTITION_MAX_OPERATION 6 > + > + > +#define MB_SHIFT 20 > + > + > +/* Ringbuffer size for the channel */ > +#define NDV_NUM_PAGES_IN_RING_BUFFER 64 > + > +#define NDV_MAX_PACKETS_PER_RECEIVE 8 > + > +#define NDV_MAX_PACKET_COUNT 16304 > + > +#define NDV_MAX_NUM_OUTSTANDING_RECEIVED_PACKETS (16304) > +#define NDV_MAX_HANDLE_TABLE_SIZE (16304) > +#define NDV_HOST_MAX_HANDLE_TABLE_SIZE (NDV_MAX_HANDLE_TABLE_SIZE * 16) > + > + > +#define NDV_MAX_MAPPINGS 4 > + > +#define NDV_STATE_NONE 0x00000000 > +#define NDV_STATE_CREATED 0x00000001 > +#define NDV_STATE_CONNECTING 0x00000002 > +#define NDV_STATE_INITIALIZING 0x00000003 > +#define NDV_STATE_OPERATIONAL 0xEFFFFFFF > +#define NDV_STATE_FAILED 0xFFFFFFFF > + > + > +#define NDV_MAX_PRIVATE_DATA_SIZE 64 > +#define NDV_MAX_IOCTL_SIZE 256 > + > +/* max size of buffer for vector of ND_MAPPING */ > +#define NDV_MAX_MAPPING_BUFFER_SIZE \ > + (NDV_MAX_MAPPINGS * sizeof(union nd_mapping)) > + > +/* max expected ioctl buffer size from users */ > +#define NDV_MAX_IOCTL_BUFFER_SIZE \ > + (NDV_MAX_IOCTL_SIZE + \ > + NDV_MAX_MAPPING_BUFFER_SIZE + \ > + NDV_MAX_PRIVATE_DATA_SIZE) > + > +/* max PFN array for inline buffers */ > +#define NDV_MAX_INLINE_PFN_ARRAY_LENGTH 32 > + > +/* Field header size for inline buffer */ > +#define NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE \ > + (NDV_MAX_MAPPINGS * sizeof(NDV_PACKET_FIELD)) > + > +/* Max for a single field */ > + > +#define NDV_MAX_SINGLE_MAPPING_FIELD ( sizeof(GPA_RANGE) + \ > + (sizeof(PFN_NUMBER) * NDV_MAX_INLINE_PFN_ARRAY_LENGTH)) > + > +/* Max for all inline data */ > + > +#define NDV_MAX_MAPPING_DATA_SIZE (NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE + \ > + (NDV_MAX_MAPPINGS * NDV_MAX_SINGLE_MAPPING_FIELD)) > + > + > +#define NDV_MAX_PACKET_HEADER_SIZE 256 > + > +#define NDV_MAX_PACKET_SIZE (NDV_MAX_PACKET_HEADER_SIZE + \ > + NDV_MAX_IOCTL_BUFFER_SIZE + \ > + NDV_MAX_MAPPING_DATA_SIZE) > + > +/* Well known message type INIT is defined for the channel > + * not for the protocol. > + */ > + > +#define NDV_PACKET_TYPE_INIT 0xFFFFFFFF > + > +/* Invalid protocol version to identify uninitialized channels */ > + > +#define NDV_PROTOCOL_VERSION_INVALID 0xFFFFFFFF > + > +/* Flags that control the behavior of packet handling */ > + > +enum ndv_packet_options { > + NDV_PACKET_OPTION_NONE = 0x00, > + > + /* Indicates that the ExternalDataMdl parameter is expected to be > + * passed and must be handled in the receiver. 
This call must be > + * handled specially to ensure that the MDL can be created correctly. > + */ > + NDV_PACKET_OPTION_EXTERNAL_DATA = 0x01, > + > + /* Indicates that the receiver must execute the handler at passive level. */ > + NDV_PACKET_OPTIONS_REQUIRES_PASSIVE = 0x02, > + > + /* Indicates that the sender does not expect and is not waiting for a > + * response packet. > + */ > + NDV_PACKET_OPTIONS_POST = 0x04, > +}; > + > +#define NDV_PACKET_TYPE(id_, opt_) \ > + (((opt_)<<24) | (id_)) > + > +#define NDV_PACKET_TYPE_OPTIONS(type_) \ > + (((type_) >> 24) & 0xFF) > + > +#define NDV_PACKET_TYPE_ID(type_) \ > + ((type_) & 0xFFFFFF) > + > +#define NDV_ADD_PACKET_OPTION(type_, opt_) \ > + (type_) |= (opt_<<24) > + > +/* The header value sent on all packets */ > +union ndv_packet_hdr { > + > + struct { > + /* The type of packet. > + * This value should be created with the NDV_PACKET_TYPE macro > + * to include all packet options within the packet type. > + */ > + u32 packet_type; > + /* The size of the entire fixed message structure that exists > + * before the data. This must be >= sizeof(NDV_PACKET_HEADER) > + */ > + u32 hdr_sz; > + /* The size of the data that follows the message; > + * data_sz + hdr_sz gives the total size of > + * the buffer that is used. > + */ > + u32 data_sz; > + /* The status code used to indicate success or failure. > + * It is only used in completions and during responses. > + */ > + u32 status; //KYS: NTSTATUS? > + }; > + > + u64 padding[2]; //KYS: why? > +}; > + > + > +/* The core INIT packet. This message is defined in the channel, > + * not in the protocol. This message should never change size > + * or behavior, as it could impact compatibility in the future. > + * This packet is used to negotiate the protocol version, so changing > + * this size could break backward compat. > + */ > + > +union ndv_packet_init { > + struct { > + u32 packet_type; > + u32 protocol_version; > + u32 flags; > + }; > + u64 padding[2]; > +} __packed; > + > +#define NDV_PACKET_INIT_SIZE 16 > + > +/* Data packing flags used for accessing the dynamic fields inside a packet */ > +#define NDV_DATA_PACKING_2 0x1 > +#define NDV_DATA_PACKING_4 0x3 > +#define NDV_DATA_PACKING_8 0x7 > + > + > +#define NDV_PROTOCOL_VERSION_1 0x0100 > +#define NDV_PROTOCOL_VERSION_CURRENT NDV_PROTOCOL_VERSION_1 > +#define NDV_PROTOCOL_VERSION_COUNT 1 > + > +struct ndv_pkt_field { > + u32 size; > + u32 offset; > +}; > + > +enum ndv_pkt_id { > + NDV_PKT_UNKNOWN = 0, > + /* Version 1 Message IDs */ > + NDV_PKT_ID1_BIND, > + NDV_PKT_ID1_CREATE, > + NDV_PKT_ID1_CLEANUP, > + NDV_PKT_ID1_CANCEL, > + NDV_PKT_ID1_CONTROL, > + NDV_PKT_ID1_COMPLETE, > + NDV_PKT_ID1_INIT_RESOURCES, > +}; > + > +/* The guest will send this as the first message just after init. > + * The resources are reserved per channel. > + */ > + > +struct ndv_pkt_hdr_init_resources_1 { > + > + union ndv_packet_hdr pkt_hdr; > + u16 io_space_sz_mb; > + u64 io_space_start; > + > +}; > + > + > + > +/* The guest will send this packet to the host after channel init > + * to query support for the adapters that are registered. 
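> + * The same layout presumably serves for unbinding as well, with the
> + * unbind flag set in the packet below; hvnd_bind_nic(), declared at
> + * the end of this header, takes a matching un_bind parameter (an
> + * inference from the names, not from the host-side code).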
> + */ > + > +struct ndv_pkt_hdr_bind_1 { > + union ndv_packet_hdr pkt_hdr; > + bool unbind; > + union nd_sockaddr_inet ip_address; > + struct if_physical_addr phys_addr; > + u64 guest_id; > +}; > + > +union ndv_context_handle { > + u64 val64; > + struct { > + u32 local; > + u32 remote; > + }; > +}; > + > +struct ndv_pkt_hdr_create_1 { > + union ndv_packet_hdr pkt_hdr; > + > + /* Identifies the object used to track this file handle on both > + * the guest and the host. When sent from the guest, it will contain > + * the guest handle. On success, the host will populate and return > + * its handle value as well. > + */ > + > + union ndv_context_handle handle; > + > + /* The parameters sent to the CreateFile call */ > + u32 access_mask; > + u32 open_options; > + > + u16 file_attributes; //KYS: This field must be 64 bit aligned > + > + u16 share_access; //KYS > + > + u32 kys_padding; //KYS > + > + u16 ea_length; //KYS; needs to be 64 bit aligned; what is ea length - unused > +}; > + > + > +struct ndv_pkt_hdr_cleanup_1 { > + union ndv_packet_hdr pkt_hdr; > + > + /* Identifies the object used to track this file handle on both > + * the guest and the host. When sent from the guest, it will contain > + * both the guest and host handle values. The host will use this > + * value to clean up its resource, then update its portion of the handle > + * to NDV_HANDLE_NULL before returning the data back to the guest. > + */ > + union ndv_context_handle handle; > +}; > + > +struct ndv_pkt_hdr_cancel_1 { > + union ndv_packet_hdr pkt_hdr; > + union ndv_context_handle file_handle; > + union ndv_context_handle irp_handle; > +}; > + > +struct ndv_bind_port_info { > + //LUID authentication_id; //KYS: LUID? > + bool is_admin; > +}; > + > +struct ndv_extended_data_flds { > + union { > + u32 field_count; > + u64 padding; > + }; > + > + //struct ndv_pkt_field fields[ANYSIZE_ARRAY]; //KYS? > +}; > + > + > +struct ndv_packet_hdr_control_1 { > + union ndv_packet_hdr pkt_hdr; > + /* Identifies the object used to track this file handle on both > + * the guest and the host. This should always have both guest > + * and host handle values inside it. > + */ > + > + union ndv_context_handle file_handle; > + > + /* The handle information for the allocated irp context object. > + * This information is used when the host/guest starts the cancellation. > + */ > + union ndv_context_handle irp_handle; > + > + /* The input data describing the IO control parameters */ > + > + u32 io_cntrl_code; > + u32 output_buf_sz; > + u32 input_buf_sz; > + u32 input_output_buf_offset; > + > + /* These are used in the return message to indicate the status of the IO > + * operation and the amount of data written to the output buffer. > + */ > + u32 io_status; //KYS: NTSTATUS? > + u32 bytes_returned; > + > + /* This contains the field information for additional data that is sent > + * with the packet that is IOCTL specific. > + */ > + > + struct ndv_pkt_field extended_data; > +}; > + > +/* > + * Include MLX specific defines. > + */ > + > +#include "mx_abi.h" > + > +/* Driver specific state. > + */ > + > +/* > + * We need to have the host open a file; some > + * Windows constants for open. 
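> + * (Illustrative reading only, based on the field names in
> + * ndv_pkt_hdr_create_1 above: the guest-side open would supply
> + * values such as access_mask = STANDARD_RIGHTS_ALL, share_access =
> + * FILE_SHARE_ALL and an OPEN_EXISTING-style disposition; the exact
> + * values the host expects are not spelled out in this header.)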
> + */ > +#define STANDARD_RIGHTS_ALL (0x001F0000L) > +#define FILE_ATTRIBUTE_NORMAL (0x80) > +#define FILE_SHARE_READ (0x00000001) > +#define FILE_SHARE_WRITE (0x00000002) > +#define FILE_SHARE_DELETE (0x00000004) > +#define FILE_FLAG_OVERLAPPED (0x40000000) > +#define FILE_SHARE_ALL (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE) > +#define CREATE_ALWAYS (2) > +#define OPEN_EXISTING (3) > + > +#define RTL_NUMBER_OF(_x) \ > + sizeof(_x)/sizeof(_x[0]) > +/* > + * The context structure tracks the open state. > + */ > + > +/* > + * Packet layout for open adaptor. > + */ > + > +/* > + * Packet for querying the address list. > + */ > + > +union query_addr_list_ioctl { > + struct nd_handle in; > + union nd_sockaddr_inet out[16]; //KYS a max of 16 addresses > +}; > + > +struct pkt_query_addr_list { > + struct ndv_packet_hdr_control_1 hdr; > + union query_addr_list_ioctl ioctl; > + unsigned long activity_id; > +}; > + > + > +struct pkt_fld { > + u32 size; > + u32 offset; > +}; > + > +struct fld_data { > + union { > + u64 padding; > + }; > +}; > + > +struct extended_data_oad { > + union { > + u32 cnt; > + u64 padding; > + }; > + /* offsets are from start of extended data struct > + * and should start on 8 byte boundary > + */ > + struct pkt_fld fields[IBV_GET_CONTEXT_MAPPING_MAX]; > +}; > + > +union oad_ioctl { > + struct nd_open_adapter input; > + struct nd_resource_descriptor resrc_desc; > +}; > + > +union oad_mappings { > + struct ibv_get_context_req ctx_input; > + struct ibv_get_context_resp ctx_output; > +}; > + > +struct pkt_nd_open_adapter { > + struct ndv_packet_hdr_control_1 hdr; > + > + union oad_ioctl ioctl; > + union oad_mappings mappings; > + > + /* > + * Extended data. > + */ > + struct extended_data_oad ext_data; > +}; > + > +/* > + * Create CQ IOCTL. > + */ > + > +struct cq_db_gpa { > + u32 byte_count; > + u32 byte_offset; > + u64 pfn_array[2]; > +}; > + > +struct cq_sn_gpa { > + u32 byte_count; > + u32 byte_offset; > + u64 pfn_array[2]; > +}; > + > +struct create_cq_ext_data { > + union { > + u32 cnt; > + u64 padding; > + }; > + /* offsets are from start of extended data struct > + * and should start on 8 byte boundary > + */ > + struct pkt_fld fields[MLX4_IB_CREATE_CQ_MAPPING_MAX]; > + struct cq_db_gpa db_gpa; > + struct cq_sn_gpa sn_gpa; > + struct gpa_range cqbuf_gpa; > +}; > + > +union create_cq_ioctl { > + struct nd_create_cq input; > + struct nd_resource_descriptor resrc_desc; > +}; > + > +union create_cq_mappings { > + struct ibv_create_cq cq_in; > + struct ibv_create_cq_resp cq_resp; > +}; > + > +struct pkt_nd_create_cq { > + struct ndv_packet_hdr_control_1 hdr; > + > + union create_cq_ioctl ioctl; > + union create_cq_mappings mappings; > + > + /* > + * Extended data. > + */ > + struct create_cq_ext_data ext_data; > +}; > + > +/* > + * IOCTL to free CQ. 
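> + *
> + * A minimal sketch of how one of these simple handle-based ioctls
> + * might be issued, assuming the hvnd_init_hdr()/hvnd_send_ioctl_pkt()
> + * helpers declared at the end of this header are used as their
> + * signatures suggest (illustrative only and kept out of the build;
> + * the struct hvnd_cookie plumbing that lets the sender wait for the
> + * host reply is elided):
> + */
> +#if 0
> +static int example_free_cq(struct hvnd_dev *nd_dev,
> +			   struct hvnd_ucontext *uctx, u64 cq_handle)
> +{
> +	struct pkt_nd_free_cq pkt; /* defined just below */
> +
> +	/* payload size, guest/host file handles, ioctl code, no ext data */
> +	hvnd_init_hdr(&pkt.hdr, sizeof(pkt) - sizeof(pkt.hdr),
> +		      uctx->create_pkt.handle.local,
> +		      uctx->create_pkt.handle.remote,
> +		      IOCTL_ND_CQ_FREE, 0, 0, 0);
> +
> +	pkt.ioctl.in.version = ND_IOCTL_VERSION;
> +	pkt.ioctl.in.reserved = 0;
> +	pkt.ioctl.in.handle = cq_handle;
> +
> +	/* real callers pass a struct hvnd_cookie so the completion
> +	 * handler can copy the reply back; 0 is a placeholder here
> +	 */
> +	return hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), 0);
> +}
> +#endif
> +/*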
> + */ > +struct free_cq_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_free_cq { > + struct ndv_packet_hdr_control_1 hdr; > + > + struct free_cq_ioctl ioctl; > +}; > + > + > +/* > + * IOCTL to QUERY CQ - CQ NOTIFY > + */ > + > +struct notify_cq_ioctl { > + struct nd_cq_notify in; > +}; > + > +struct pkt_nd_notify_cq { > + struct ndv_packet_hdr_control_1 hdr; > + struct notify_cq_ioctl ioctl; > +}; > + > +/* > + * IOCTL to create a listener > + */ > + > +struct nd_ep_create { > + struct nd_handle hdr; > + bool to_semantics; > + unsigned long activity_id; > +}; > + > +union listener_cr_ioctl { > + struct nd_ep_create in; > + u64 out; > +}; > + > +struct pkt_nd_cr_listener { > + struct ndv_packet_hdr_control_1 hdr; > + union listener_cr_ioctl ioctl; > +}; > + > +/* > + * IOCTL to free listener. > + */ > + > +struct listener_free_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_free_listener { > + struct ndv_packet_hdr_control_1 hdr; > + struct listener_free_ioctl ioctl; > +}; > + > +/* > + * IOCTL for listener cancel IO. > + */ > +struct listener_cancelio_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_cancelio_listener { > + struct ndv_packet_hdr_control_1 hdr; > + struct listener_cancelio_ioctl ioctl; > +}; > + > +/* > + * IOCTL for LISTENER BIND > + */ > + > +union listener_bind_ioctl { > + struct ndk_bind in; > +}; > + > +struct pkt_nd_bind_listener { > + struct ndv_packet_hdr_control_1 hdr; > + union listener_bind_ioctl ioctl; > +}; > + > +/* > + * After the listener is bound, enable > + * listening. > + */ > + > +union listener_listen_ioctl { > + struct nd_listen in; > +}; > + > +struct pkt_nd_listen_listener { > + struct ndv_packet_hdr_control_1 hdr; > + union listener_listen_ioctl ioctl; > +}; > + > +/* > + * IOCTL for getting the address from a listener. > + * > + */ > + > +union listener_get_addr_ioctl { > + struct nd_handle in; > + union nd_sockaddr_inet out; > +}; > + > +struct pkt_nd_get_addr_listener { > + struct ndv_packet_hdr_control_1 hdr; > + union listener_get_addr_ioctl ioctl; > +}; > + > +/* > + * IOCTL to get a connection from a listener. > + */ > + > +union listener_get_connection_ioctl { > + struct nd_get_connection_request in; > + union nd_sockaddr_inet out; > +}; > + > +struct pkt_nd_get_connection_listener { > + struct ndv_packet_hdr_control_1 hdr; > + union listener_get_connection_ioctl ioctl; > +}; > + > + > +/* > + * Connector IOCTLs > + */ > + > +/* > + * IOCTL to create connector. > + */ > + > +union connector_cr_ioctl { //KYS should this be a union or struct? > + struct nd_ep_create in; > + u64 out; > +}; > + > +struct pkt_nd_cr_connector { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_cr_ioctl ioctl; //KYS: union or struct > +}; > + > +/* > + * IOCTL to free connector. > + */ > + > +struct connector_free_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_free_connector { > + struct ndv_packet_hdr_control_1 hdr; > + struct connector_free_ioctl ioctl; > +}; > + > +/* > + * IOCTL to cancel I/O on a connector. > + */ > + > +struct connector_cancelio_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_cancelio_connector { > + struct ndv_packet_hdr_control_1 hdr; > + struct connector_cancelio_ioctl ioctl; > +}; > + > +/* > + * IOCTL to Bind an address to the connector. 
> + */ > + > +union connector_bind_ioctl { > + struct ndk_bind in; > +}; > + > +struct pkt_nd_bind_connector { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_bind_ioctl ioctl; > +}; > + > +/* > + * IOCTL to connect a connector. > + */ > + > +struct connector_connect_in { > + struct nd_connect hdr; > + u8 retry_cnt; > + u8 rnr_retry_cnt; > + u8 priv_data[56]; > + unsigned long activity_id; > +}; > + > +union connector_connect_ioctl { > + struct connector_connect_in in; > +}; > + > +struct pkt_nd_connector_connect { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_connect_ioctl ioctl; > +}; > + > +/* > + * IOCTL for connector complete connect > + */ > + > +struct complete_connect_in { > + struct nd_handle hdr; > + u8 rnr_nak_to; > + unsigned long activity_id; > +}; > + > +struct complete_connect_out { > + enum ibv_qp_state state; > +}; > + > +union connector_complete_connect_ioctl { > + struct complete_connect_in in; > + struct complete_connect_out out; > +}; > + > +struct pkt_nd_connector_connect_complete { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_complete_connect_ioctl ioctl; > +}; > + > + > +#define MAX_PRIVATE_DATA_LEN 148 > + > +/* > + * IOCTL for connector accept. > + */ > + > +struct connector_accept_in { > + struct nd_accept hdr; > + u8 rnr_retry_cnt; > + u8 rnr_nak_to; > + u8 private_data[MAX_PRIVATE_DATA_LEN]; > + unsigned long activity_id; > +}; > + > +struct connector_accept_out { > + enum ibv_qp_state state; > +}; > + > +union connector_accept_ioctl { > + struct connector_accept_in in; > + struct connector_accept_out out; > +}; > + > +struct pkt_nd_connector_accept { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_accept_ioctl ioctl; > +}; > + > +/* > + * IOCTL for connector to reject a connection. > + */ > + > +struct connector_reject_in { > + struct nd_reject hdr; > + u8 private_data[MAX_PRIVATE_DATA_LEN]; > +}; > + > +struct connector_reject_out { > + enum ibv_qp_state state; > +}; > + > +union connector_reject_ioctl { > + struct connector_reject_in in; > + struct connector_reject_out out; > +}; > + > +struct pkt_nd_connector_reject { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_reject_ioctl ioctl; > +}; > + > +/* > + * IOCTL to get connector read limits. > + */ > + > +struct connector_get_rd_limits_in { > + struct nd_handle in; > +}; > + > +struct connector_get_rd_limits_out { > + struct nd_read_limits out; > +}; > + > +union connector_get_rd_limits_ioctl { > + struct connector_get_rd_limits_in in; > + struct connector_get_rd_limits_out out; > +}; > + > +struct pkt_nd_connector_get_rd_limits { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_get_rd_limits_ioctl ioctl; > +}; > + > +/* > + * IOCTL to get connector private data. > + */ > +union connector_get_priv_data_ioctl { > + struct nd_handle in; > + u8 out[MAX_PRIVATE_DATA_LEN]; > +}; > + > +struct pkt_nd_connector_get_priv_data { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_get_priv_data_ioctl ioctl; > +}; > + > + > +/* > + * IOCTL get peer address. > + */ > + > +union connector_get_peer_addr_ioctl { > + struct nd_handle in; > + union nd_sockaddr_inet out; > +}; > + > +struct pkt_nd_connector_get_peer_addr { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_get_peer_addr_ioctl ioctl; > +}; > + > +/* > + * IOCTL to get connector address. 
> + */ > + > +union connector_get_addr_ioctl { > + struct nd_handle in; > + union nd_sockaddr_inet out; > +}; > + > +struct pkt_nd_connector_get_addr { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_get_addr_ioctl ioctl; > +}; > + > +/* > + * IOCTL for disconnect notification. > + */ > + > +union connector_notify_disconnect_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_connector_notify_disconnect { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_notify_disconnect_ioctl ioctl; > +}; > + > +union connector_disconnect_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_connector_disconnect { > + struct ndv_packet_hdr_control_1 hdr; > + union connector_notify_disconnect_ioctl ioctl; > +}; > + > +/* > + * IOCTLs for QP operations. > + */ > + > +/* > + * Create qp IOCTL. > + */ > + > +struct qp_db_gpa { > + u32 byte_count; > + u32 byte_offset; > + u64 pfn_array[1]; > +}; > + > +struct create_qp_ext_data { > + union { > + u32 cnt; > + u64 padding; > + }; > + /* offsets are from start of extended data struct > + * and should start on 8 byte boundary > + */ > + struct pkt_fld fields[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; > + struct qp_db_gpa db_gpa; > + struct gpa_range qpbuf_gpa; > +}; > + > +union create_qp_ioctl { > + struct nd_create_qp input; > + struct nd_resource_descriptor resrc_desc; > +}; > + > +union create_qp_mappings { > + struct ibv_create_qp qp_in; > + struct ibv_create_qp_resp qp_resp; > +}; > + > +struct pkt_nd_create_qp { > + struct ndv_packet_hdr_control_1 hdr; > + > + union create_qp_ioctl ioctl; > + union create_qp_mappings mappings; > + > + /* > + * Extended data. > + */ > + struct create_qp_ext_data ext_data; > +}; > + > +/* > + * IOCTL to flush a QP. > + */ > +struct flush_qp_ioctl { > + struct nd_handle in; > + enum ibv_qp_state out; > +}; > + > +struct pkt_nd_flush_qp { > + struct ndv_packet_hdr_control_1 hdr; > + struct flush_qp_ioctl ioctl; > +}; > + > +/* > + * Memory Region IOCTLS > + */ > +union create_mr_ioctl { > + struct nd_handle in; > + u64 out; > +}; > + > +struct pkt_nd_create_mr { > + struct ndv_packet_hdr_control_1 hdr; > + union create_mr_ioctl ioctl; > +}; > + > +struct mr_out { > + u32 lkey; > + u32 rkey; > + unsigned long activity_id; > +}; > + > + > +union register_mr_ioctl { > + struct nd_mr_register in; > + struct mr_out out; > +}; > + > +struct pkt_nd_register_mr { > + struct ndv_packet_hdr_control_1 hdr; > + union register_mr_ioctl ioctl; > +}; > + > +struct deregister_mr_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_deregister_mr { > + struct ndv_packet_hdr_control_1 hdr; > + struct deregister_mr_ioctl ioctl; > +}; > + > +/* > + * IOCTL to disconnect connector > + */ > + > +/* > + * Create PD IOCTL. > + */ > +struct nd_create_pd_ioctl { > + union { > + struct nd_handle in; > + u64 out_handle; > + }; > + struct ibv_alloc_pd_resp resp; > +}; > + > +struct pkt_nd_pd_create { > + struct ndv_packet_hdr_control_1 hdr; > + struct nd_create_pd_ioctl ioctl; > +}; > + > +/* > + * Free Handle. Check the layout with Luke. > + * > + */ > +struct free_handle_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_free_handle { > + struct ndv_packet_hdr_control_1 hdr; > + struct free_handle_ioctl ioctl; > +}; > + > +/* > + * Cancel I/O. 
> + */ > + > +struct cancel_io_ioctl { > + struct nd_handle in; > +}; > + > +struct pkt_nd_cancel_io { > + struct ndv_packet_hdr_control_1 hdr; > + struct cancel_io_ioctl ioctl; > +}; > + > +/* > + * Connector states: > + */ > + > +enum connector_state { > + HVND_CON_INCOMING, > + HVND_CON_INCOMING_ESTABLISHED, > + HVND_CON_INCOMING_REJECTED, > + HVND_CON_OUTGOING_REQUEST > +}; > + > + > +/* > + * Adaptor query IOCTL. > + */ > +struct nd_adap_query_ioctl { > + union { > + struct nd_adapter_query ad_q; > + struct adapter_info_v2 ad_info; > + }; > +}; > + > +struct pkt_nd_query_adaptor { > + struct ndv_packet_hdr_control_1 hdr; > + struct nd_adap_query_ioctl ioctl; > +}; > + > +struct nd_ioctl { > + union { > + struct nd_handle handle; > + u8 raw_buffer[NDV_MAX_IOCTL_BUFFER_SIZE]; > + }; > +}; > + > +struct pkt_nd_provider_ioctl { > + struct ndv_packet_hdr_control_1 hdr; > + struct nd_ioctl ioctl; > +}; > + > +struct hvnd_ib_pd { > + struct ib_pd ibpd; > + u32 pdn; > + u64 handle; > +}; > + > +struct hvnd_work { > + struct work_struct work; > + void *callback_arg; > +}; > + > +struct hvnd_disconnect_work { > + struct work_struct work; > + int status; > + void *callback_arg; > +}; > + > +/* > +struct hvnd_delayed_work { > + struct delayed_work work; > + void *callback_arg; > +}; > +*/ > + > +enum hvnd_cm_state { > + hvnd_cm_idle = 0, > + hvnd_cm_connect_reply_sent, //active > + hvnd_cm_connect_reply_refused, > + hvnd_cm_connect_received, //active > + hvnd_cm_connect_request_sent, //passive > + hvnd_cm_accept_sent, > + hvnd_cm_close_sent, > + hvnd_cm_established_sent, > +}; > + > +struct incoming_pkt { > + struct list_head list_entry; > + char pkt[0]; > +}; > + > +struct hvnd_ep_obj { > +/* > + spinlock_t ep_lk; > + bool to_be_destroyed; > + bool io_outstanding; > + > + wait_queue_head_t wait; > + bool stopped; > + atomic_t process_refcnt; // how many NDV_PKT_ID1_COMPLETE packets we are currently processing > +*/ > + bool stopping; > + wait_queue_head_t wait_pending; > + atomic_t nr_requests_pending; > + > + enum nd_resource_type type; > + enum connector_state state; //KYS need to look at locking > + struct iw_cm_id *cm_id; > + enum hvnd_cm_state cm_state; > + struct completion block_event; > + struct completion disconnect_event; > + struct completion connector_accept_event; > + int connector_accept_status; > + u64 ep_handle; > + spinlock_t incoming_pkt_list_lock; > + struct list_head incoming_pkt_list; > + struct hvnd_ep_obj *parent; > + struct hvnd_dev *nd_dev; > + struct hvnd_ucontext *uctx; > + struct hvnd_work wrk; > + struct hvnd_cq *cq; > + u8 ord; > + u8 ird; > + char priv_data[MAX_PRIVATE_DATA_LEN]; > + bool incoming; > + atomic_t disconnect_notified; > + u64 outstanding_handle; > + u32 local_irp; > + struct hvnd_ep_obj *outstanding_ep; > + struct pkt_nd_connector_connect connector_connect_pkt; > + int connector_connect_retry; > +}; > + > +struct hvnd_ucontext { > + struct ib_ucontext ibucontext; > + struct list_head listentry; > + struct ndv_pkt_hdr_create_1 create_pkt; > + struct ndv_pkt_hdr_create_1 create_pkt_ovl; /* Overlap handle */ > + struct pkt_nd_provider_ioctl pr_init_pkt; > + union ndv_context_handle file_handle; > + union ndv_context_handle file_handle_ovl; > + > + struct pkt_nd_open_adapter o_adap_pkt; > + > + u64 adaptor_hdl; > + > + /* > + * Protection domain state. 
> + */ > + struct pkt_nd_pd_create pd_cr_pkt; > + > + u64 uar_base; > + u64 bf_base; > + u32 bf_buf_size; > + u32 bf_offset; > + u32 cqe_size; > + u32 max_qp_wr; > + u32 max_sge; > + u32 max_cqe; > + u32 num_qps; > + > + /* > + * State to manage doorbell pages: > + */ > + struct list_head db_page_list; > + struct mutex db_page_mutex; > + > + atomic_t refcnt; > + > +}; > + > +struct hvnd_dev { > + struct ib_device ibdev; > + struct hv_device *hvdev; > + u32 device_cap_flags; > + unsigned char nports; > + bool ib_active; > + > + /* State to manage interaction with the host. > + */ > + > + spinlock_t uctxt_lk; > + struct list_head listentry; > + > + unsigned long mmio_sz; > + unsigned long mmio_start_addr; > + struct resource mmio_resource; > + void *mmio_virt; > + > + unsigned long negotiated_version; > + union ndv_packet_init init_pkt; > + struct ndv_pkt_hdr_init_resources_1 resources; > + struct ndv_pkt_hdr_bind_1 bind_pkt; > + > + struct ndv_pkt_hdr_create_1 global_create_pkt; > + union ndv_context_handle global_file_handle; > + > + struct semaphore query_pkt_sem; > + bool query_pkt_set; > + struct pkt_nd_query_adaptor query_pkt; > + > + /* > + * ID tables. > + */ > + spinlock_t id_lock; > + > + struct idr cqidr; > + struct idr qpidr; > + struct idr mmidr; > + struct idr irpidr; > + struct idr uctxidr; > + atomic_t open_cnt; > + > + char ip_addr[4]; > + char mac_addr[6]; > + struct completion addr_set; > + int bind_complete; > + struct mutex bind_mutex; > +}; > + > +struct hvnd_cq { > + struct ib_cq ibcq; > + void *cq_buf; > + void *db_addr; > + u32 arm_sn; > + u32 entries; > + > + u32 cqn; > + u32 cqe; > + u64 cq_handle; > + > + struct ib_umem *umem; > + struct ib_umem *db_umem; > + struct mlx4_ib_user_db_page user_db_page; > + struct hvnd_ucontext *uctx; > + struct hvnd_ep_obj ep_object; //KYS need to clean this up; have a cq irp state > + bool monitor; > + bool upcall_pending; > +}; > + > +struct hvnd_qp { > + struct ib_qp ibqp; > + void *qp_buf; > + void *db_addr; > + u32 buf_size; > + u8 port; > + struct hvnd_dev *nd_dev; > + > + __u8 log_sq_bb_count; > + __u8 log_sq_stride; > + __u8 sq_no_prefetch; > + > + int rq_wqe_cnt; > + int rq_wqe_shift; > + int rq_max_gs; > + > + int sq_wqe_cnt; > + int sq_wqe_shift; > + int sq_max_gs; > + > + u32 max_inline_data; > + > + u32 initiator_q_depth; > + u32 initiator_request_sge; > + > + u32 receive_q_depth; > + u32 receive_request_sge; > + > + struct hvnd_cq *recv_cq; > + struct hvnd_cq *send_cq; > + > + u64 receive_cq_handle; > + u64 initiator_cq_handle; > + u64 pd_handle; > + > + u64 qp_handle; > + u32 qpn; > + u32 max_send_wr; > + u32 max_recv_wr; > + u32 max_send_sge; > + u32 max_recv_sge; > + > + struct ib_umem *umem; > + struct ib_umem *db_umem; > + struct mlx4_ib_user_db_page user_db_page; > + struct hvnd_ucontext *uctx; > + struct iw_cm_id *cm_id; > + > + /* > + * Current QP state; need to look at locking. > + * XXXKYS > + */ > + enum ib_qp_state qp_state; > + bool cq_notify; > + wait_queue_head_t wait; > + atomic_t refcnt; > + struct hvnd_ep_obj *connector; > +}; > + > +struct hvnd_mr { > + struct ib_mr ibmr; > + struct hvnd_ib_pd *pd; > + struct ib_umem *umem; > + u64 start; > + u64 length; > + u64 virt; > + int acc; > + u64 mr_handle; > + u32 mr_lkey; > + u32 mr_rkey; > +}; > + > +struct hvnd_cookie { > + struct completion host_event; > + void *pkt; > +}; > + > +/* > + * Definitions to retrieve the IP address. 
> + */ > + > +#define HVND_CURRENT_VERSION 0 > + > +struct hvnd_ipaddr_tuple { > + char mac_address[ETH_ALEN]; > + struct sockaddr addr; > +}; > + > +struct hvnd_msg { > + int status; > + struct hvnd_ipaddr_tuple ip_tuple; > +}; > + > +static inline struct hvnd_ib_pd *to_nd_pd(struct ib_pd *pd) > +{ > + return container_of(pd, struct hvnd_ib_pd, ibpd); > +} > + > +static inline struct hvnd_dev *to_nd_dev(struct ib_device *ibdev) > +{ > + return container_of(ibdev, struct hvnd_dev, ibdev); > +} > + > +static inline struct hvnd_cq *to_nd_cq(struct ib_cq *ibcq) > +{ > + return container_of(ibcq, struct hvnd_cq, ibcq); > +} > + > +static inline struct hvnd_qp *to_nd_qp(struct ib_qp *ibqp) > +{ > + return container_of(ibqp, struct hvnd_qp, ibqp); > +} > + > +static inline struct hvnd_ucontext *to_nd_context(struct ib_ucontext *ibucontext) > +{ > + return container_of(ibucontext, struct hvnd_ucontext, ibucontext); > +} > + > +static inline struct hvnd_ucontext *get_uctx_from_pd(struct ib_pd *pd) > +{ > + return to_nd_context(pd->uobject->context); > +} > + > +static inline struct hvnd_mr *to_nd_mr(struct ib_mr *ibmr) > +{ > + return container_of(ibmr, struct hvnd_mr, ibmr); > +} > +/* > + * ID management. > + */ > + > +static inline int insert_handle(struct hvnd_dev *dev, struct idr *idr, > + void *handle, u32 id) > +{ > + int ret; > + unsigned long flags; > + > + idr_preload(GFP_KERNEL); > + spin_lock_irqsave(&dev->id_lock, flags); > + > + ret = idr_alloc(idr, handle, id, id + 1, GFP_ATOMIC); > + > + spin_unlock_irqrestore(&dev->id_lock, flags); > + idr_preload_end(); > + > + BUG_ON(ret == -ENOSPC); > + return ret < 0 ? ret : 0; > +} > + > +static inline void remove_handle(struct hvnd_dev *dev, struct idr *idr, u32 id) > +{ > + unsigned long flags; > + > + spin_lock_irqsave(&dev->id_lock, flags); > + idr_remove(idr, id); > + spin_unlock_irqrestore(&dev->id_lock, flags); > +} > + > +static inline struct hvnd_cq *get_cqp(struct hvnd_dev *dev, u32 cqid) > +{ > + struct hvnd_cq *cqp; > + unsigned long flags; > + > + spin_lock_irqsave(&dev->id_lock, flags); > + cqp = idr_find(&dev->cqidr, cqid); > + spin_unlock_irqrestore(&dev->id_lock, flags); > + > + return cqp; > +} > + > +static inline struct hvnd_qp *get_qpp(struct hvnd_dev *dev, u32 qpid) > +{ > + struct hvnd_qp *qpp; > + unsigned long flags; > + > + spin_lock_irqsave(&dev->id_lock, flags); > + qpp = idr_find(&dev->qpidr, qpid); > + spin_unlock_irqrestore(&dev->id_lock, flags); > + > + return qpp; > +} > + > +static inline struct hvnd_ucontext *get_uctx(struct hvnd_dev *dev, u32 pid) > +{ > + struct hvnd_ucontext *uctx; > + unsigned long flags; > + > + spin_lock_irqsave(&dev->id_lock, flags); > + uctx = idr_find(&dev->uctxidr, pid); > + spin_unlock_irqrestore(&dev->id_lock, flags); > + > + return uctx; > +} > + > + > +static inline void *map_irp_to_ctx(struct hvnd_dev *nd_dev, u32 irp) > +{ > + void *ctx; > + unsigned long flags; > + > + spin_lock_irqsave(&nd_dev->id_lock, flags); > + ctx = idr_find(&nd_dev->irpidr, irp); > + spin_unlock_irqrestore(&nd_dev->id_lock, flags); > + > + return ctx; > +} > + > + > + > +void hvnd_callback(void *context); > +int hvnd_negotiate_version(struct hvnd_dev *nd_dev); > +int hvnd_init_resources(struct hvnd_dev *nd_dev); > +int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, char *mac_addr); > +int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx); > +int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx); > +int hvnd_query_adaptor(struct hvnd_dev 
*nd_dev, struct hvnd_ucontext *uctx); > +int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev, > + struct hvnd_ib_pd *hvnd_pd); > + > +/* > + * CQ operations. > + */ > +int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_cq *cq); > +int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq); > +int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq, > + u32 notify_type, u64 irp_handle); > + > +/* > + * QP operations. > + */ > +int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_qp *qp); > + > +int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_qp *qp); > + > +int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_qp *qp); > + > +/* > + * MR operations. > + */ > + > +int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 pd_handle, u64 *mr_handle); > + > +int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle); > + > +int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + struct hvnd_mr *mr); > +int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle); > + > +/* > + * Listener operations. > + */ > +int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, u64 *handle); > + > + > +int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle); > + > +int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, union nd_sockaddr_inet *addr); > + > +int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, u32 backlog); > + > +int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, union nd_sockaddr_inet *addr); > + > +int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 listener_handle, u64 connector_handle, > + u64 irp_handle); > + > +/* > + * Connector operations. 
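> + *
> + * Read together with the connector ioctls above, the active side
> + * appears to proceed as hvnd_cr_connector(), hvnd_bind_connector(),
> + * hvnd_connector_connect() and, once the asynchronous connect
> + * completes, hvnd_connector_complete_connect(); this ordering is
> + * inferred from the names and the ASYNCH markers, not stated
> + * explicitly anywhere in the patch.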
> + */ > +int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 *connector_handle); > + > +int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle); > + > +int hvnd_cancelio_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle); > +int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle, union nd_sockaddr_inet *addr); > + > +int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit, > + u32 priv_data_length, const u8 *priv_data, > + u64 qp_handle, struct if_physical_addr *phys_addr, > + union nd_sockaddr_inet *dest_addr, struct hvnd_ep_obj *ep); > + > +int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, enum ibv_qp_state *qp_state); > + > +int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, > + u64 qp_handle, > + u32 in_rd_limit, u32 out_rd_limit, > + u32 priv_data_length, const u8 *priv_data, > + enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep); > + > +int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 connector_handle, > + u32 priv_data_length, u8 *priv_data, > + enum ibv_qp_state *qp_state); > + > +int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + struct nd_read_limits *rd_limits); > + > +int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + u8 *priv_data); > + > +int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + union nd_sockaddr_inet *peer_addr); > + > +int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, > + union nd_sockaddr_inet *local_addr); > + > +int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, struct hvnd_ep_obj *ep); > + > + > +int hvnd_connector_disconnect(struct hvnd_dev *nd_dev, > + struct hvnd_ucontext *uctx, > + u64 connector_handle, struct hvnd_ep_obj *ep); > + > +int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + u64 handle, u32 ioctl); > + > +int hvnd_cancel_io(struct hvnd_ep_obj *ep_object); > + > +char *hvnd_get_op_name(int ioctl); > + > +void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx); > +void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev,struct hvnd_ucontext *uctx); > +void hvnd_process_events(struct work_struct *work); > + > +void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep, int status); > +void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep, int status); > +void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status); > +void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status); > +void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status); > + > +void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp); > +int get_irp_handle(struct hvnd_dev *nd_dev, u32 *local, void *irp_ctx); > + > +void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr, > + u32 data_sz, u32 local, u32 remote, > + u32 ioctl_code, > + u32 ext_data_sz, u32 ext_data_offset, > + u64 irp_handle); > + > +int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev, > + struct ndv_packet_hdr_control_1 *hdr, > + u32 pkt_size, u64 
cookie); > + > +int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > + union nd_sockaddr_inet *og_addr); > + > +int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote, char *mac_addr); > + > +void hvnd_addr_init(void); > + > +void hvnd_addr_deinit(void); > + > +bool ep_add_work_pending(struct hvnd_ep_obj *ep_object); > +void ep_del_work_pending(struct hvnd_ep_obj *ep_object); > +void ep_stop(struct hvnd_ep_obj *ep_object); > + > +#define current_pid() (current->pid) > +/* > + * NT STATUS defines. > + */ > + > +#define STATUS_SUCCESS 0x0 > +#define STATUS_PENDING 0x00000103 > +#define STATUS_CANCELLED 0xC0000120 > +#define STATUS_DISCONNECTED 0xC000020C > +#define STATUS_TIMEOUT 0xC00000B5 > + > +void inc_ioctl_counter_request(unsigned ioctl); > +void inc_ioctl_counter_response(unsigned ioctl); > + > +#define NDV_PROTOCOL_VAERSION_INVALID -1 > +#define NDV_PACKET_INIT_SIZE 16 /* Size of the INIT packet */ > + > +#define HVND_RING_SZ (PAGE_SIZE * 64) > + > +/* logging levels */ > +#define HVND_ERROR 0 > +#define HVND_WARN 1 > +#define HVND_INFO 2 > +#define HVND_DEBUG 3 > + > +extern int hvnd_log_level; > + > +#define hvnd_error(fmt, args...) hvnd_log(HVND_ERROR, fmt, ##args) > +#define hvnd_warn(fmt, args...) hvnd_log(HVND_WARN, fmt, ##args) > +#define hvnd_info(fmt, args...) hvnd_log(HVND_INFO, fmt, ##args) > +#define hvnd_debug(fmt, args...) hvnd_log(HVND_DEBUG, fmt, ##args) > + > +#define hvnd_log(level, fmt, args...) \ > +do { \ > + if (unlikely(hvnd_log_level >= (level))) \ > + printk(KERN_ERR "hvnd %s[%u]: " fmt, __func__, __LINE__, ##args); \ > +} while (0) > + > +#endif /* _VMBUS_RDMA_H */ > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/COPYING b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/COPYING > deleted file mode 100644 > index f0960c946b07..000000000000 > --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/COPYING > +++ /dev/null > @@ -1,339 +0,0 @@ > - > -"This software program is licensed subject to the GNU General Public License > -(GPL). Version 2, June 1991, available at > -<http://www.fsf.org/copyleft/gpl.html>" > - > -GNU General Public License > - > -Version 2, June 1991 > - > -Copyright (C) 1989, 1991 Free Software Foundation, Inc. > -51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA > - > -Everyone is permitted to copy and distribute verbatim copies of this license > -document, but changing it is not allowed. > - > -Preamble > - > -The licenses for most software are designed to take away your freedom to > -share and change it. By contrast, the GNU General Public License is intended > -to guarantee your freedom to share and change free software--to make sure > -the software is free for all its users. This General Public License applies > -to most of the Free Software Foundation's software and to any other program > -whose authors commit to using it. (Some other Free Software Foundation > -software is covered by the GNU Library General Public License instead.) You > -can apply it to your programs, too. > - > -When we speak of free software, we are referring to freedom, not price. Our > -General Public Licenses are designed to make sure that you have the freedom > -to distribute copies of free software (and charge for this service if you > -wish), that you receive source code or can get it if you want it, that you > -can change the software or use pieces of it in new free programs; and that > -you know you can do these things. 
> diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/COPYING b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/COPYING
> deleted file mode 100644
> index f0960c946b07..000000000000
> --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/COPYING
> +++ /dev/null
> @@ -1,339 +0,0 @@

[... 339 lines of GPL v2 license text snipped -- byte-identical to the COPYING added under vmbus-rdma-144.0 earlier in this patch ...]
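Not a blocker, but since every file under vmbus-rdma-142/ and vmbus-rdma-144/
comes back byte-for-byte under the new .0 directories, generating the patch
with rename detection would collapse the ~18k added/removed lines into a
handful of rename records and make this much easier to review, e.g.:

    $ git format-patch -M -1 <commit>

Also worth double-checking: the commit message talks about
vmbus-rdma-144.0.ko, but as the Makefile below shows, what this directory
actually built was hv_network_direct_144.ko -- do the new Makefiles keep that
naming scheme?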
> diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/Makefile b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/Makefile
> deleted file mode 100644
> index 53814a4b8e41..000000000000
> --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/Makefile
> +++ /dev/null
> @@ -1,3 +0,0 @@
> -obj-$(CONFIG_HYPERV_INFINIBAND_ND) += hv_network_direct_144.o
> -
> -hv_network_direct_144-y := provider.o vmbus_rdma.o hvnd_addr.o
> diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/hvnd_addr.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/hvnd_addr.c
> deleted file mode 100644
> index c8c2295b63af..000000000000
> --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/hvnd_addr.c
> +++ /dev/null
> @@ -1,136 +0,0 @@
> -/*
> - * Copyright (c) 2014, Microsoft Corporation.
> - *
> - * Author:
> - *   K. Y. Srinivasan <kys@microsoft.com>
> - *
> - * This program is free software; you can redistribute it and/or modify it
> - * under the terms of the GNU General Public License version 2 as published
> - * by the Free Software Foundation.
> - *
> - * This program is distributed in the hope that it will be useful, but
> - * WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> - * NON INFRINGEMENT. See the GNU General Public License for more
> - * details.
> - *
> - * Bug fixes/enhancements: Long Li <longli@microsoft.com>
> - */
> -
> -#include <linux/completion.h>
> -#include <linux/module.h>
> -#include <linux/errno.h>
> -#include <linux/hyperv.h>
> -#include <linux/efi.h>
> -#include <linux/slab.h>
> -#include <linux/cred.h>
> -#include <linux/sched.h>
> -#include <linux/types.h>
> -#include <linux/completion.h>
> -#include <linux/scatterlist.h>
> -#include <rdma/ib_umem.h>
> -#include <rdma/ib_addr.h>
> -
> -#include "vmbus_rdma.h"
> -
> -
> -#include <linux/semaphore.h>
> -#include <linux/fs.h>
> -#include <linux/nls.h>
> -#include <linux/workqueue.h>
> -#include <linux/cdev.h>
> -#include <linux/hyperv.h>
> -#include <linux/sched.h>
> -#include <linux/uaccess.h>
> -#include <linux/miscdevice.h>
> -#include <linux/hyperv.h>
> -
> -
> -int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev,
> -				struct hvnd_ucontext *uctx,
> -				union nd_sockaddr_inet *og_addr)
> -{
> -	int ret;
> -	/*
> -	 * Query the host and select the first address.
> -	 */
> -	struct pkt_query_addr_list pkt;
> -
> -	// KYS try to avoid having to zero everything
> -	memset(&pkt, 0, sizeof(pkt));
> -
> -	hvnd_init_hdr(&pkt.hdr,
> -		      (sizeof(pkt) -
> -		      sizeof(struct ndv_packet_hdr_control_1)),
> -		      uctx->create_pkt.handle.local,
> -		      uctx->create_pkt.handle.remote,
> -		      IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST, 0, 0, 0);
> -
> -	pkt.ioctl.in.version = ND_VERSION_1;
> -	pkt.ioctl.in.reserved = 0;
> -	pkt.ioctl.in.handle = uctx->adaptor_hdl;
> -
> -	ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr,
> -				  sizeof(pkt), (u64)&pkt);
> -
> -	if (ret)
> -		return ret;
> -
> -	/*
> -	 * Copy the address out.
> -	 */
> -
> -	memcpy(og_addr, &pkt.ioctl.out[0], sizeof(*og_addr));
> -	return 0;
> -
> -}
> -
> -static struct rdma_addr_client self;
> -
> -struct resolve_cb_context {
> -	struct rdma_dev_addr *addr;
> -	struct completion comp;
> -};
> -
> -void hvnd_addr_init(void)
> -{
> -	rdma_addr_register_client(&self);
> -}
> -
> -void hvnd_addr_deinit(void)
> -{
> -	rdma_addr_unregister_client(&self);
> -}
> -
> -static void resolve_cb(int status, struct sockaddr *src_addr,
> -		       struct rdma_dev_addr *addr, void *context)
> -{
> -	memcpy(((struct resolve_cb_context *)context)->addr, addr, sizeof(struct
> -		rdma_dev_addr));
> -	complete(&((struct resolve_cb_context *)context)->comp);
> -}
> -
> -int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote,
> -			    char *mac_addr)
> -{
> -	struct rdma_dev_addr dev_addr;
> -	struct resolve_cb_context ctx;
> -	int ret;
> -
> -	memset(&dev_addr, 0, sizeof(dev_addr));
> -	dev_addr.net = &init_net;
> -	ctx.addr = &dev_addr;
> -	init_completion(&ctx.comp);
> -
> -	ret = rdma_resolve_ip(&self, local, remote, &dev_addr, 1000,
> -			      resolve_cb, &ctx);
> -
> -	if (ret) {
> -		hvnd_error("rdma_resolve_ip failed ret=%d\n", ret);
> -		return ret;
> -	}
> -
> -	wait_for_completion(&ctx.comp);
> -	memcpy(mac_addr, dev_addr.dst_dev_addr, ETH_ALEN);
> -	return ret;
> -}
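Two small observations on hvnd_addr.c above (again pre-existing, not
introduced by the rename): the include block pulls in linux/hyperv.h three
times and linux/completion.h and linux/sched.h twice; and resolve_cb() throws
its status argument away, so when rdma_resolve_ip() times out after the
1000 ms it is given, hvnd_get_neigh_mac_addr() still returns 0 and copies
whatever happens to be in dst_dev_addr. An untested sketch of propagating the
failure (adds a status field to the context, otherwise the same shape as the
code above):

    struct resolve_cb_context {
            struct rdma_dev_addr *addr;
            struct completion comp;
            int status;
    };

    static void resolve_cb(int status, struct sockaddr *src_addr,
                           struct rdma_dev_addr *addr, void *context)
    {
            struct resolve_cb_context *ctx = context;

            ctx->status = status;
            if (!status)
                    memcpy(ctx->addr, addr, sizeof(*ctx->addr));
            complete(&ctx->comp);
    }

hvnd_get_neigh_mac_addr() would then return ctx.status after the
wait_for_completion() instead of the rdma_resolve_ip() return value.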
> diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/mx_abi.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/mx_abi.h
> deleted file mode 100644
> index 8f55cb408de0..000000000000
> --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/mx_abi.h
> +++ /dev/null
> @@ -1,232 +0,0 @@

[... Topspin/Cisco/PathScale dual-license boilerplate snipped ...]

> -#ifndef MX_ABI_H
> -#define MX_ABI_H
> -
> -
> -
> -/*
> - * Make sure that all structs defined in this file remain laid out so
> - * that they pack the same way on 32-bit and 64-bit architectures (to
> - * avoid incompatibility between 32-bit userspace and 64-bit kernels).
> - * Specifically:
> - *  - Do not use pointer types -- pass pointers in UINT64 instead.
> - *  - Make sure that any structure larger than 4 bytes is padded to a
> - *    multiple of 8 bytes. Otherwise the structure size will be
> - *    different between 32-bit and 64-bit architectures.
> - */

[... the ibv_*/mlx4_* ABI enum and struct definitions snipped, down to #endif /* MX_ABI_H */ -- identical to the mx_abi.h added under vmbus-rdma-144.0 ...]
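The comment kept above is the load-bearing rule for this file, since these
structs travel between 32-bit userspace and a 64-bit kernel with no compat
layer. A hypothetical illustration (my example, not from the patch) of why
the padding clause matters:

    /* Without the trailing pad this struct is 12 bytes on i386 but
     * 16 on x86_64 (the u64 forces 8-byte alignment and tail padding
     * there); with the pad, both are 16 and the ABI matches.
     */
    struct abi_example {
            u64 buf_addr;   /* user pointer passed as u64, never void * */
            u32 flags;
            u32 reserved;   /* pad the size to a multiple of 8 */
    };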
> diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/provider.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/provider.c
> deleted file mode 100644
> index 7115cf769fe0..000000000000
> --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/provider.c
> +++ /dev/null
> @@ -1,2961 +0,0 @@
> -/*
> - * Copyright (c) 2014, Microsoft Corporation.
> - *
> - * Author:
> - *   K. Y. Srinivasan <kys@microsoft.com>
> - *
> - * This program is free software; you can redistribute it and/or modify it
> - * under the terms of the GNU General Public License version 2 as published
> - * by the Free Software Foundation.
> - *
> - * This program is distributed in the hope that it will be useful, but
> - * WITHOUT ANY WARRANTY; without even the implied warranty of
> - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
> - * NON INFRINGEMENT. See the GNU General Public License for more
> - * details.
> - *
> - * Bug fixes/enhancements: Long Li <longli@microsoft.com>
> - */
> -
> -#include <linux/module.h>
> -#include <linux/moduleparam.h>
> -#include <linux/device.h>
> -#include <linux/netdevice.h>
> -#include <linux/etherdevice.h>
> -#include <linux/delay.h>
> -#include <linux/errno.h>
> -#include <linux/list.h>
> -#include <linux/spinlock.h>
> -#include <linux/ethtool.h>
> -#include <linux/rtnetlink.h>
> -#include <linux/inetdevice.h>
> -#include <linux/io.h>
> -#include <linux/hyperv.h>
> -#include <linux/completion.h>
> -#include <rdma/iw_cm.h>
> -#include <rdma/ib_verbs.h>
> -#include <rdma/ib_smi.h>
> -#include <rdma/ib_umem.h>
> -#include <rdma/ib_user_verbs.h>
> -#include <rdma/rdma_cm.h>
> -#include <linux/proc_fs.h>
> -#include <linux/seq_file.h>
> -#include "vmbus_rdma.h"
> -
> -/*
> - * We are emulating mlx4. XXXKYS: May have to FIX.
> - */ > -#include "./user.h" > - > -static struct hvnd_dev *g_nd_dev; // the one and only one > - > -int hvnd_log_level = HVND_ERROR; > -module_param(hvnd_log_level, int, 0644); > -MODULE_PARM_DESC(hvnd_log_level, > - "Logging level, 0 - Error (default), 1 - Warning, 2 - Info, 3 - Debug."); > - > -static int disable_cq_notify = 1; > -//static int disable_cq_notify = 0; > -module_param(disable_cq_notify, int, 0644); > -MODULE_PARM_DESC(disable_cq_notify, > - "Disable CQ notification, 0 - Enable, 1 - Disable (default)."); > - > -enum { > - MLX4_USER_DEV_CAP_64B_CQE = 1L << 0 > -}; > - > -#define HVND_NODE_DESC "vmbus-RDMA" > - > -#undef MLX4_IB_UVERBS_ABI_VERSION > -#define MLX4_IB_UVERBS_ABI_VERSION 4 > - > -struct mlx4_wqe_data_seg { > - __be32 byte_count; > - __be32 lkey; > - __be64 addr; > -}; > - > -/* return value: > - * true: ep is running > - * false: ep is stopped > - */ > -bool ep_add_work_pending(struct hvnd_ep_obj *ep_object) > -{ > - bool ret = true; > - > - atomic_inc(&ep_object->nr_requests_pending); > - if (ep_object->stopping) { > - if (atomic_dec_and_test(&ep_object->nr_requests_pending)) > - wake_up(&ep_object->wait_pending); > - ret = false; > - } > - return ret; > -} > - > -void ep_del_work_pending(struct hvnd_ep_obj *ep_object) > -{ > - if (atomic_dec_and_test(&ep_object->nr_requests_pending)) > - wake_up(&ep_object->wait_pending); > - > - if (atomic_read(&ep_object->nr_requests_pending) < 0) { > - hvnd_error("ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", > - atomic_read(&ep_object->nr_requests_pending), > - ep_object->type, ep_object->cm_state); > - dump_stack(); > - } > -} > - > -void ep_stop(struct hvnd_ep_obj *ep_object) > -{ > - if (!ep_object->stopping) { > - ep_object->stopping = true; > - hvnd_cancel_io(ep_object); > - } > - > - if (atomic_read(&ep_object->nr_requests_pending) < 0) { > - hvnd_error("IO canceled, ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", > - atomic_read(&ep_object->nr_requests_pending), > - ep_object->type, ep_object->cm_state); > - dump_stack(); > - } > - > - wait_event(ep_object->wait_pending, > - !atomic_read(&ep_object->nr_requests_pending)); > -} > - > -static int vmbus_dma_map_sg(struct device *dev, struct scatterlist *sgl, > - int nents, enum dma_data_direction direction, > - unsigned long attrs) > -{ > - struct scatterlist *sg; > - u64 addr; > - int i; > - int ret = nents; > - > - BUG_ON(!valid_dma_direction(direction)); > - > - for_each_sg(sgl, sg, nents, i) { > - addr = (u64) page_address(sg_page(sg)); > - /* TODO: handle highmem pages */ > - if (!addr) { > - ret = 0; > - break; > - } > - sg->dma_address = addr + sg->offset; > - sg->dma_length = sg->length; > - } > - return ret; > -} > - > -static void vmbus_dma_unmap_sg(struct device *dev, > - struct scatterlist *sg, int nents, > - enum dma_data_direction direction, unsigned long attrs) > -{ > - BUG_ON(!valid_dma_direction(direction)); > -} > - > - > -const struct dma_map_ops vmbus_dma_ops = { > - .map_sg = vmbus_dma_map_sg, > - .unmap_sg = vmbus_dma_unmap_sg, > -}; > - > -static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, > - struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx); > - > -static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, > - struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx); > - > -static void hvnd_deinit_ep(struct hvnd_ep_obj *ep) > -{ > - put_irp_handle(ep->nd_dev, ep->local_irp); > -} > - > -static void hvnd_destroy_ep(struct hvnd_ep_obj *ep) > -{ > - hvnd_debug("canceling work for 
ep %p\n", ep); > - cancel_work_sync(&ep->wrk.work); > - hvnd_deinit_ep(ep); > - kfree(ep); > -} > - > - > -#define UC(b) (((int)b)&0xff) > -char *debug_inet_ntoa(struct in_addr in, char *b) > -{ > - register char *p; > - > - p = (char *)∈ > - (void)snprintf(b, 20, > - "%d.%d.%d.%d", UC(p[0]), UC(p[1]), UC(p[2]), UC(p[3])); > - return b; > -} > - > -void hvnd_process_events(struct work_struct *work); > - > -static int hvnd_init_ep(struct hvnd_ep_obj *ep_object, > - struct iw_cm_id *cm_id, int ep_type, > - struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx) > -{ > - int ret; > - > - ep_object->type = ep_type; > - ep_object->cm_id = cm_id; > - ep_object->nd_dev = nd_dev; > - ep_object->uctx = uctx; > - > - ep_object->parent = NULL; > - > - ep_object->wrk.callback_arg = ep_object; > - INIT_WORK(&ep_object->wrk.work, hvnd_process_events); > - INIT_LIST_HEAD(&ep_object->incoming_pkt_list); > - spin_lock_init(&ep_object->incoming_pkt_list_lock); > - > - ep_object->stopping = false; > - atomic_set(&ep_object->nr_requests_pending, 0); > - init_waitqueue_head(&ep_object->wait_pending); > - > - ret = get_irp_handle(nd_dev, &ep_object->local_irp, (void *)ep_object); > - > - if (ret) { > - hvnd_error("get_irp_handle() failed: err: %d\n", ret); > - return ret; > - } > - return 0; > - > -} > - > -static int set_rq_size(struct hvnd_dev *dev, struct ib_qp_cap *cap, > - struct hvnd_qp *qp) > -{ > - > - /* HW requires >= 1 RQ entry with >= 1 gather entry */ > - if (!cap->max_recv_wr || !cap->max_recv_sge) > - return -EINVAL; > - > - qp->rq_wqe_cnt = roundup_pow_of_two(max(1U, cap->max_recv_wr)); > - qp->rq_max_gs = roundup_pow_of_two(max(1U, cap->max_recv_sge)); > - qp->rq_wqe_shift = ilog2( > - qp->rq_max_gs * sizeof(struct mlx4_wqe_data_seg)); > - > - > - return 0; > -} > - > -static int set_user_sq_size(struct hvnd_dev *dev, > - struct hvnd_qp *qp, > - struct mlx4_ib_create_qp *ucmd) > -{ > - qp->sq_wqe_cnt = 1 << ucmd->log_sq_bb_count; > - qp->sq_wqe_shift = ucmd->log_sq_stride; > - > - qp->buf_size = (qp->rq_wqe_cnt << qp->rq_wqe_shift) + > - (qp->sq_wqe_cnt << qp->sq_wqe_shift); > - > - return 0; > -} > - > -static int hvnd_db_map_user(struct hvnd_ucontext *uctx, unsigned long virt, > - struct ib_umem **db_umem) > -{ > - struct mlx4_ib_user_db_page *page; > - int err = 0; > - > - mutex_lock(&uctx->db_page_mutex); > - > - list_for_each_entry(page, &uctx->db_page_list, list) > - if (page->user_virt == (virt & PAGE_MASK)) > - goto found; > - > - page = kmalloc(sizeof(*page), GFP_KERNEL); > - if (!page) { > - err = -ENOMEM; > - goto out; > - } > - > - page->user_virt = (virt & PAGE_MASK); > - page->refcnt = 0; > - page->umem = ib_umem_get(&uctx->ibucontext, virt & PAGE_MASK, > - PAGE_SIZE, 0, 0); > - if (IS_ERR(page->umem)) { > - hvnd_error("ib_umem_get failure\n"); > - err = PTR_ERR(page->umem); > - kfree(page); > - goto out; > - } > - > - list_add(&page->list, &uctx->db_page_list); > - > -found: > - ++page->refcnt; > -out: > - mutex_unlock(&uctx->db_page_mutex); > - if (!err) > - *db_umem = page->umem; > - > - return err; > -} > - > -static void hvnd_db_unmap_user(struct hvnd_ucontext *uctx, u64 db_addr) > -{ > - struct mlx4_ib_user_db_page *page; > - > - mutex_lock(&uctx->db_page_mutex); > - list_for_each_entry(page, &uctx->db_page_list, list) > - if (page->user_virt == (db_addr & PAGE_MASK)) > - goto found; > - > -found: > - if (!--page->refcnt) { > - list_del(&page->list); > - ib_umem_release(page->umem); > - kfree(page); > - } > - > - mutex_unlock(&uctx->db_page_mutex); > -} > - > - > -static 
void debug_check(const char *func, int line) > -{ > - hvnd_debug("func is: %s; line is %d\n", func, line); > - > - if (in_interrupt()) { > - hvnd_error("In interrupt func is: %s; line is %d\n", > - func, line); > - return; > - } > -} > - > -static struct ib_ah *hvnd_ah_create(struct ib_pd *pd, > - struct ib_ah_attr *ah_attr, > - struct ib_udata *udata) > -{ > - debug_check(__func__, __LINE__); > - return ERR_PTR(-ENOSYS); > -} > - > -static int hvnd_ah_destroy(struct ib_ah *ah) > -{ > - debug_check(__func__, __LINE__); > - return -ENOSYS; > -} > - > -static int hvnd_multicast_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) > -{ > - debug_check(__func__, __LINE__); > - return -ENOSYS; > -} > - > -static int hvnd_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) > -{ > - debug_check(__func__, __LINE__); > - return -ENOSYS; > -} > - > -static int hvnd_process_mad(struct ib_device *ibdev, > - int mad_flags, > - u8 port_num, > - const struct ib_wc *in_wc, > - const struct ib_grh *in_grh, > - const struct ib_mad_hdr *in_mad, > - size_t in_mad_size, > - struct ib_mad_hdr *out_mad, > - size_t *out_mad_size, > - u16 *out_mad_pkey_index) > -{ > - debug_check(__func__, __LINE__); > - return -ENOSYS; > -} > - > -void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx) > -{ > - atomic_inc(&uctx->refcnt); > -} > - > -void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx) > -{ > - if (atomic_dec_and_test(&uctx->refcnt)) { > - hvnd_debug("uctx ref cnt dropped it is %d\n", > - atomic_read(&uctx->refcnt)); > - hvnd_debug("About to close adaptor\n"); > - hvnd_close_adaptor(nd_dev, uctx); > - } else > - hvnd_debug("uctx ref cnt dropped it is %d\n", > - atomic_read(&uctx->refcnt)); > -} > - > - > -static int hvnd_dealloc_ucontext(struct ib_ucontext *context) > -{ > - struct hvnd_dev *nd_dev; > - struct hvnd_ucontext *uctx; > - > - uctx = to_nd_context(context); > - nd_dev = to_nd_dev(context->device); > - > - hvnd_debug("calling %s\n", __func__); > - > - hvnd_drop_uctx_ref(nd_dev, uctx); > - > - return 0; > -} > - > -static struct ib_ucontext *hvnd_alloc_ucontext(struct ib_device *ibdev, > - struct ib_udata *udata) > -{ > - struct hvnd_dev *nd_dev = to_nd_dev(ibdev); > - struct hvnd_ucontext *uctx; > - struct mlx4_ib_alloc_ucontext_resp resp; > - struct ibv_get_context_resp *ctx_output; > - int ret; > - > - if (!nd_dev->ib_active) { > - hvnd_error("ib device is not active, try again\n"); > - return ERR_PTR(-EAGAIN); > - } > - > - uctx = get_uctx(nd_dev, current_pid()); > - if (uctx) { > - // it is already opened, just increase its reference count > - hvnd_acquire_uctx_ref(uctx); > - } else { > - > - /* > - * The Windows host expects the following to be done: > - * 1. Successfully send struct ndv_pkt_hdr_create_1 > - * 2. INIT PROVIDER > - * 3. Open Adapter > - * Before we can complete this call. > - */ > - > - uctx = kzalloc(sizeof(struct hvnd_ucontext), GFP_KERNEL); > - if (!uctx) > - return ERR_PTR(-ENOMEM); > - > - atomic_set(&uctx->refcnt, 1); > - INIT_LIST_HEAD(&uctx->db_page_list); > - mutex_init(&uctx->db_page_mutex); > - > - /* > - * Stash away the context with the calling PID. 
> - */ > - ret = insert_handle(nd_dev, &nd_dev->uctxidr, uctx, > - current_pid()); > - if (ret) { > - hvnd_error("Uctx ID insertion failed; ret is %d\n", > - ret); > - goto err1; > - } > - > - hvnd_debug("Opening adaptor pid is %d\n", current_pid()); > - > - ret = hvnd_open_adaptor(nd_dev, uctx); > - if (ret) { > - hvnd_error("hvnd_open_adaptor failed ret=%d\n", ret); > - goto err1; > - } > - > - } > - > - /* > - * Copy the response out. > - */ > - ctx_output = &uctx->o_adap_pkt.mappings.ctx_output; > - resp.dev_caps = MLX4_USER_DEV_CAP_64B_CQE; > - resp.qp_tab_size = ctx_output->qp_tab_size; > - resp.bf_reg_size = ctx_output->bf_reg_size; > - resp.bf_regs_per_page = ctx_output->bf_regs_per_page; > - resp.cqe_size = ctx_output->cqe_size; > - > - ret = ib_copy_to_udata(udata, &resp, sizeof(resp)); > - if (ret) { > - hvnd_error("ib_copy_to_udata failed ret=%d\n", ret); > - goto err1; > - } > - > - return &uctx->ibucontext; > - > -err1: > - kfree(uctx); > - return ERR_PTR(ret); > -} > - > -static int hvnd_mmap(struct ib_ucontext *context, struct vm_area_struct *vma) > -{ > - struct hvnd_ucontext *uctx = to_nd_context(context); > - > - if (vma->vm_end - vma->vm_start != PAGE_SIZE) { > - hvnd_error("vma not a page size, actual size=%lu\n", > - vma->vm_end - vma->vm_start); > - return -EINVAL; > - } > - > - if (vma->vm_pgoff == 0) { > - vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); > - > - if (io_remap_pfn_range(vma, vma->vm_start, > - (uctx->uar_base >> PAGE_SHIFT), > - PAGE_SIZE, vma->vm_page_prot)) { > - hvnd_error("io_remap_pfn_range failure\n"); > - return -EAGAIN; > - } > - } else if (vma->vm_pgoff == 1 && uctx->bf_buf_size != 0) { > - vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); > - > - if (io_remap_pfn_range(vma, vma->vm_start, > - (uctx->uar_base >> PAGE_SHIFT) + 1, > - PAGE_SIZE, vma->vm_page_prot)) { > - hvnd_error("io_remap_pfn_range failure\n"); > - return -EAGAIN; > - } > - } else { > - hvnd_error("check code\n"); > - return -EINVAL; > - } > - > - return 0; > -} > - > -static int hvnd_deallocate_pd(struct ib_pd *pd) > -{ > - > - struct hvnd_ucontext *uctx; > - struct hvnd_dev *nd_dev; > - struct hvnd_ib_pd *hvnd_pd; > - struct ib_ucontext *ibuctx = pd->uobject->context; > - > - hvnd_pd = to_nd_pd(pd); > - nd_dev = to_nd_dev(pd->device); > - uctx = to_nd_context(ibuctx); > - > - hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, > - IOCTL_ND_PD_FREE); > - > - hvnd_drop_uctx_ref(nd_dev, uctx); > - return 0; > -} > - > -static struct ib_pd *hvnd_allocate_pd(struct ib_device *ibdev, > - struct ib_ucontext *context, > - struct ib_udata *udata) > -{ > - struct hvnd_ucontext *uctx; > - struct hvnd_dev *nd_dev; > - int ret; > - struct hvnd_ib_pd *hvnd_pd; > - > - if (!context) { > - hvnd_error("kernel mode context not supported\n"); > - return ERR_PTR(-EINVAL); > - } > - > - hvnd_pd = kzalloc(sizeof(struct hvnd_ib_pd), GFP_KERNEL); > - > - if (!hvnd_pd) > - return ERR_PTR(-ENOMEM); > - > - uctx = to_nd_context(context); > - nd_dev = to_nd_dev(ibdev); > - > - ret = hvnd_create_pd(uctx, nd_dev, hvnd_pd); > - if (ret) { > - hvnd_error("hvnd_create_pd failure ret=%d\n", ret); > - goto error_cr_pd; > - } > - > - if (context) { > - if (ib_copy_to_udata(udata, &hvnd_pd->pdn, sizeof(__u32))) { > - hvnd_error("ib_copy_to_udata failure\n"); > - ret = -EFAULT; > - goto error_fault; > - } > - } > - > - hvnd_acquire_uctx_ref(uctx); > - return &hvnd_pd->ibpd; > - > -error_fault: > - hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, > - IOCTL_ND_PD_FREE); > - > -error_cr_pd: > - 
kfree(hvnd_pd); > - return ERR_PTR(ret); > -} > - > -static int hvnd_query_pkey(struct ib_device *ibdev, u8 port, u16 index, > - u16 *pkey) > -{ > - debug_check(__func__, __LINE__); > - *pkey = 0; > - return 0; > -} > - > -static int hvnd_query_gid(struct ib_device *ibdev, u8 port, int index, > - union ib_gid *gid) > -{ > - int ret; > - struct hvnd_dev *nd_dev = to_nd_dev(ibdev); > - > - debug_check(__func__, __LINE__); > - > - ret = wait_for_completion_timeout(&nd_dev->addr_set, 60*HZ); > - if (!ret) > - return -ETIMEDOUT; > - > - memset(&(gid->raw[0]), 0, sizeof(gid->raw)); > - memcpy(&(gid->raw[0]), nd_dev->mac_addr, 6); > - return 0; > -} > - > -static int hvnd_query_device(struct ib_device *ibdev, > - struct ib_device_attr *props, > - struct ib_udata *udata) > -{ > - struct hvnd_dev *nd_dev = to_nd_dev(ibdev); > - struct adapter_info_v2 *adap_info; > - > - if (!nd_dev->query_pkt_set) { > - hvnd_error("query packet not received yet\n"); > - return 0; > - } > - > - adap_info = &nd_dev->query_pkt.ioctl.ad_info; > - > - memset(props, 0, sizeof(*props)); > - > - /* > - * Copy the relevant properties out. > - */ > - props->fw_ver = 0; > - props->device_cap_flags = 0; > - //props->device_cap_flags |= IB_DEVICE_BAD_PKEY_CNTR; > - //props->device_cap_flags |= IB_DEVICE_BAD_QKEY_CNTR; > - //props->device_cap_flags |= IB_DEVICE_XRC; > - > - props->vendor_id = 0x15b3; > - props->vendor_part_id = adap_info->device_id; > - > - props->max_mr_size = ~0ull; > - props->page_size_cap = PAGE_SIZE; > - props->max_qp = 16384; > - props->max_qp_wr = min(adap_info->max_recv_q_depth, > - adap_info->max_initiator_q_depth); > - > - props->max_sge = min(adap_info->max_initiator_sge, > - adap_info->max_recv_sge); > - props->max_cq = 0x1FFFF; > - props->max_cqe = adap_info->max_completion_q_depth; > - props->max_mr = 16384; > - props->max_pd = 16384; > - > - props->max_qp_rd_atom = adap_info->max_inbound_read_limit; > - props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; > - props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; > - props->max_srq = 16384; > - props->max_srq_wr = adap_info->max_recv_q_depth; > - props->max_srq_sge = adap_info->max_recv_sge; > - > - return 0; > -} > - > -static int hvnd_query_port(struct ib_device *ibdev, u8 port, > - struct ib_port_attr *props) > -{ > - memset(props, 0, sizeof(struct ib_port_attr)); > - > - props->max_mtu = IB_MTU_4096; > - props->active_mtu = IB_MTU_4096; > - > - /* > - * KYS: TBD need to base this on netdev. 
> - */ > - props->state = IB_PORT_ACTIVE; > - > - props->port_cap_flags = IB_PORT_CM_SUP; > - > - props->gid_tbl_len = 1; > - props->pkey_tbl_len = 1; > - props->active_width = 1; > - props->active_speed = IB_SPEED_DDR; //KYS: check > - props->max_msg_sz = -1; > - > - return 0; > -} > - > -static enum rdma_link_layer > -hvnd_get_link_layer(struct ib_device *device, u8 port) > -{ > - return IB_LINK_LAYER_ETHERNET; > -} > - > -static ssize_t hvnd_show_rev(struct device *dev, struct device_attribute *attr, > - char *buf) > -{ > - return 0; > -} > - > -static ssize_t hvnd_show_fw_ver(struct device *dev, > - struct device_attribute *attr, char *buf) > -{ > - return 0; > -} > - > -static ssize_t hvnd_show_hca(struct device *dev, struct device_attribute *attr, > - char *buf) > -{ > - return 0; > -} > - > -static ssize_t hvnd_show_board(struct device *dev, > - struct device_attribute *attr, char *buf) > -{ > - return 0; > -} > - > -static int hvnd_get_port_immutable(struct ib_device *ibdev, u8 port_num, > - struct ib_port_immutable *immutable) > -{ > - struct ib_port_attr attr; > - int err; > - > - err = hvnd_query_port(ibdev, port_num, &attr); > - if (err) > - return err; > - > - immutable->pkey_tbl_len = attr.pkey_tbl_len; > - immutable->gid_tbl_len = attr.gid_tbl_len; > - immutable->core_cap_flags = RDMA_CORE_PORT_IWARP; > - > - return 0; > -} > - > -static struct ib_qp *hvnd_ib_create_qp(struct ib_pd *pd, > - struct ib_qp_init_attr *attrs, > - struct ib_udata *udata) > -{ > - struct hvnd_ucontext *uctx; > - struct hvnd_dev *nd_dev; > - struct mlx4_ib_create_qp ucmd; > - struct hvnd_qp *qp; > - int ret = 0; > - struct hvnd_ib_pd *hvnd_pd = to_nd_pd(pd); > - struct hvnd_cq *send_cq = to_nd_cq(attrs->send_cq); > - struct hvnd_cq *recv_cq = to_nd_cq(attrs->recv_cq); > - > - uctx = get_uctx_from_pd(pd); > - nd_dev = to_nd_dev(pd->device); > - > - if (attrs->qp_type != IB_QPT_RC) { > - hvnd_error("attrs->qp_type=%d not IB_QPT_RC\n", attrs->qp_type); > - return ERR_PTR(-EINVAL); > - } > - > - qp = kzalloc(sizeof(*qp), GFP_KERNEL); > - if (!qp) { > - ret = -ENOMEM; > - goto err_done; > - } > - > - qp->uctx = uctx; > - > - if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { > - hvnd_error("ib_copy_from_udata failed\n"); > - ret = -EFAULT; > - goto err_ucpy; > - } > - > - qp->qp_buf = (void *)ucmd.buf_addr; > - qp->db_addr = (void *)ucmd.db_addr; > - qp->log_sq_bb_count = ucmd.log_sq_bb_count; > - qp->log_sq_stride = ucmd.log_sq_stride; > - qp->sq_no_prefetch = ucmd.sq_no_prefetch; > - qp->port = attrs->port_num; > - > - init_waitqueue_head(&qp->wait); > - atomic_set(&qp->refcnt, 1); > - > - qp->recv_cq = recv_cq; > - qp->send_cq = send_cq; > - qp->nd_dev = nd_dev; > - > - qp->receive_cq_handle = recv_cq->cq_handle; > - qp->initiator_cq_handle = send_cq->cq_handle; > - qp->pd_handle = hvnd_pd->handle; > - qp->cq_notify = false; > - > - qp->ibqp.qp_num = attrs->qp_type == IB_QPT_SMI ? 
0 : 1; > - > - qp->max_inline_data = attrs->cap.max_inline_data; > - > - qp->initiator_q_depth = attrs->cap.max_send_wr; > - qp->initiator_request_sge = attrs->cap.max_send_sge; > - > - > - qp->receive_q_depth = attrs->cap.max_recv_wr; > - qp->receive_request_sge = attrs->cap.max_recv_sge; > - > - set_rq_size(nd_dev, &attrs->cap, qp); > - > - set_user_sq_size(nd_dev, qp, &ucmd); > - > - qp->umem = ib_umem_get(&uctx->ibucontext, ucmd.buf_addr, > - qp->buf_size, 0, 0); > - if (IS_ERR(qp->umem)) { > - ret = PTR_ERR(qp->umem); > - hvnd_error("ib_umem_get failed ret=%d\n", ret); > - goto err_ucpy; > - } > - > - ret = hvnd_db_map_user(uctx, ucmd.db_addr, &qp->db_umem); > - > - if (ret) { > - hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); > - goto err_db_map; > - } > - > - ret = hvnd_create_qp(nd_dev, uctx, qp); > - > - if (ret) { > - hvnd_error("hvnd_create_qp failed ret=%d\n", ret); > - goto err_qp; > - } > - > - hvnd_acquire_uctx_ref(uctx); > - > - qp->ibqp.qp_num = qp->qpn; > - qp->ibqp.qp_type = IB_QPT_RC; > - > - > - return &qp->ibqp; > - > -err_qp: > - hvnd_db_unmap_user(uctx, ucmd.db_addr); > - > -err_db_map: > - ib_umem_release(qp->umem); > - > -err_ucpy: > - kfree(qp); > -err_done: > - return ERR_PTR(ret); > -} > - > -static int hvnd_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, > - int attr_mask, struct ib_udata *udata) > -{ > - struct hvnd_qp *qp = to_nd_qp(ibqp); > - struct hvnd_dev *nd_dev = to_nd_dev(ibqp->device); > - enum ib_qp_state cur_state, new_state; > - int ret = 0; > - > - if (attr != NULL) { > - cur_state = (attr_mask & IB_QP_CUR_STATE) > - ? attr->cur_qp_state : qp->qp_state; > - new_state = (attr_mask & IB_QP_STATE) > - ? attr->qp_state : cur_state; > - > - hvnd_debug("qp->qp_state is %d new state is %d\n", > - qp->qp_state, new_state); > - hvnd_debug("current qp state is %d\n", cur_state); > - if (attr_mask & IB_QP_STATE) { > - /* Ensure the state is valid */ > - if (attr->qp_state < 0 || attr->qp_state > IB_QPS_ERR) { > - hvnd_error("incorrect qp state attr->qp_state=%d\n", > - attr->qp_state); > - return -EINVAL; > - } > - > - if (qp->qp_state != new_state) { > - qp->qp_state = new_state; > - /* > - * The only state transition supported is the > - * transition to error state. 
> - */ > - switch (new_state) { > - case IB_QPS_ERR: > - case IB_QPS_SQD: > - ret = hvnd_flush_qp(nd_dev, qp->uctx, > - qp); > - > - if (ret) > - hvnd_error("hvnd_flush_qp failed ret=%d\n", ret); > - > - // immediately notify the upper layer > - // on disconnection > - if (!ret && qp->connector) > - hvnd_process_notify_disconnect( > - qp->connector, > - STATUS_SUCCESS); > - > - return ret; > - > - default: > - break; > - } > - } > - } > - } > - return 0; > -} > - > - > -static int hvnd_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, > - int attr_mask, struct ib_qp_init_attr *init_attr) > -{ > - struct hvnd_qp *qp = to_nd_qp(ibqp); > - > - memset(attr, 0, sizeof(*attr)); > - memset(init_attr, 0, sizeof(*init_attr)); > - > - attr->qp_state = qp->qp_state; > - > - init_attr->cap.max_send_wr = qp->max_send_wr; > - init_attr->cap.max_recv_wr = qp->max_recv_wr; > - > - init_attr->cap.max_send_sge = qp->max_send_sge; > - init_attr->cap.max_recv_sge = qp->max_recv_sge; > - init_attr->cap.max_inline_data = qp->max_inline_data; > - > - init_attr->sq_sig_type = IB_SIGNAL_ALL_WR; > - > - return 0; > -} > - > -static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status); > -static int hvnd_destroy_qp(struct ib_qp *ib_qp) > -{ > - int ret; > - struct hvnd_qp *qp = to_nd_qp(ib_qp); > - struct hvnd_dev *nd_dev = to_nd_dev(ib_qp->device); > - u64 jiffies; > - > - if (!qp->connector) { > - hvnd_warn("error: connector is NULL; skip destroying connector\n"); > - goto free_qp; > - } > - > - /* > - * should we flush the qp first on ctrl-C?, > - * no need to disconnect on abrupt shutdown? > - */ > - if (qp->qp_state != IB_QPS_ERR && qp->qp_state != IB_QPS_SQD) { > - hvnd_warn("qp_state=%d, doing abrupt disconnect\n", > - qp->qp_state); > - hvnd_flush_qp(nd_dev, qp->uctx, qp); > - > - ep_stop(qp->connector); > - > - // now no pending activity is possible on the connector > - > - switch (qp->connector->cm_state) { > - > - case hvnd_cm_idle: > - case hvnd_cm_connect_reply_refused: > - case hvnd_cm_connect_request_sent: > - case hvnd_cm_close_sent: > - hvnd_warn("cm_state = %d not doing anything\n", > - qp->connector->cm_state); > - break; > - > - case hvnd_cm_connect_received: > - hvnd_warn("cm_state = %d refusing pending connection request\n", > - qp->connector->cm_state); > - hvnd_refuse_connection(qp->connector, -ECONNREFUSED); > - break; > - > - case hvnd_cm_connect_reply_sent: > - case hvnd_cm_established_sent: > - case hvnd_cm_accept_sent: > - hvnd_warn("cm_state = %d notifying disconnect on existing connection\n", > - qp->connector->cm_state); > - hvnd_process_notify_disconnect(qp->connector, > - STATUS_CANCELLED); > - break; > - > - default: > - hvnd_error("unknown cm_state = %d\n", > - qp->connector->cm_state); > - > - } > - goto free_connector; > - } else { > - hvnd_debug("qp_state=%d, doing normal disconnect\n", > - qp->qp_state); > - } > - > - if (!ep_add_work_pending(qp->connector)) > - goto free_connector; > - > - init_completion(&qp->connector->disconnect_event); > - > - /* > - * First issue a disconnect on the connector. > - */ > - > - hvnd_debug("calling hvnd_connector_disconnect\n"); > - ret = hvnd_connector_disconnect(nd_dev, qp->uctx, > - qp->connector->ep_handle, > - qp->connector); > - if (ret) { > - ep_del_work_pending(qp->connector); > - hvnd_error("disconnect: retval is %d\n", ret); > - ep_stop(qp->connector); > - goto free_connector; > - } > - /* > - * Now wait for the disconnect. 
> - */ > - jiffies = get_jiffies_64(); > - wait_for_completion(&qp->connector->disconnect_event); > - hvnd_debug("Completed disconnect connector=%p jiffies=%llu\n", > - qp->connector, get_jiffies_64() - jiffies); > - > - /* > - * Now free up the connector and drop the reference on uctx. > - */ > - > - ep_stop(qp->connector); > - > -free_connector: > - hvnd_debug("destroying connector handle: %p\n", > - (void *) qp->connector->ep_handle); > - hvnd_free_handle(nd_dev, qp->uctx, > - qp->connector->ep_handle, > - IOCTL_ND_CONNECTOR_FREE); > - > - hvnd_drop_uctx_ref(nd_dev, qp->uctx); > - hvnd_destroy_ep(qp->connector); > - qp->connector = NULL; > -free_qp: > - atomic_dec(&qp->refcnt); > - hvnd_debug("Waiting for the ref cnt to go to 0\n"); > - > - wait_event(qp->wait, !atomic_read(&qp->refcnt)); > - > - hvnd_debug("About to destroy qp\n"); > - hvnd_db_unmap_user(qp->uctx, (u64)qp->db_addr); > - ib_umem_release(qp->umem); > - > - hvnd_debug("About to free qp\n"); > - ret = hvnd_free_qp(nd_dev, qp->uctx, qp); > - > - if (ret == 0) { > - hvnd_drop_uctx_ref(nd_dev, qp->uctx); > - kfree(qp); > - } else { > - hvnd_error("free qp failed: ret is %d\n", ret); > - } > - > - return ret; > -} > - > -static struct ib_cq *hvnd_ib_create_cq(struct ib_device *ibdev, > - const struct ib_cq_init_attr *attr, > - struct ib_ucontext *ib_context, > - struct ib_udata *udata) > -{ > - struct hvnd_ucontext *uctx; > - struct hvnd_dev *nd_dev; > - struct mlx4_ib_create_cq ucmd; > - struct hvnd_cq *cq; > - int ret = 0; > - int entries = attr->cqe; > - > - uctx = to_nd_context(ib_context); > - nd_dev = to_nd_dev(ibdev); > - > - if (entries < 1 || entries > uctx->max_cqe) { > - hvnd_error("incorrct entries=%d\n", entries); > - ret = -EINVAL; > - goto err_done; > - } > - > - cq = kzalloc(sizeof(*cq), GFP_KERNEL); > - if (!cq) { > - ret = -ENOMEM; > - goto err_done; > - } > - > - entries = roundup_pow_of_two(entries + 1); > - cq->ibcq.cqe = entries - 1; > - cq->entries = entries; > - cq->uctx = uctx; > - > - if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { > - hvnd_error("ib_copy_from_udata failed\n"); > - ret = -EFAULT; > - goto err_ucpy; > - } > - > - cq->cq_buf = (void *)ucmd.buf_addr; > - cq->db_addr = (void *)ucmd.db_addr; > - cq->arm_sn = 0; > - > - /* > - * Initialize the IRP state. Need to have a separate irp state > - * for CQ; for now share it with Listener/connector. 
> - */ > - ret = hvnd_init_ep(&cq->ep_object, NULL, ND_CQ, nd_dev, uctx); > - > - if (ret) { > - hvnd_error("hvnd_init_ep failed ret=%d\n", ret); > - goto err_ucpy; > - } > - > - cq->ep_object.cq = cq; > - cq->monitor = true; > - > - cq->umem = ib_umem_get(ib_context, ucmd.buf_addr, > - (entries * uctx->cqe_size), > - IB_ACCESS_LOCAL_WRITE, 1); > - if (IS_ERR(cq->umem)) { > - ret = IS_ERR(cq->umem); > - hvnd_error("ib_umem_get failed ret=%d\n", ret); > - goto err_ucpy; > - } > - > - ret = hvnd_db_map_user(uctx, ucmd.db_addr, &cq->db_umem); > - > - if (ret) { > - hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); > - goto err_db_map; > - } > - > - ret = hvnd_create_cq(nd_dev, uctx, cq); > - > - if (ret) { > - hvnd_error("hvnd_create_cq failed ret=%d\n", ret); > - goto err_cq; > - } > - > - cq->ep_object.ep_handle = cq->cq_handle; > - > - if (ib_copy_to_udata(udata, &cq->cqn, sizeof(__u32))) { > - hvnd_error("ib_copy_to_udata failed\n"); > - ret = -EFAULT; > - goto err_ucpy_out; > - } > - > - if (!disable_cq_notify) { > - > - if (!ep_add_work_pending(&cq->ep_object)) > - goto err_ucpy_out; > - > - ret = hvnd_notify_cq(nd_dev, cq, ND_CQ_NOTIFY_ANY, > - (u64)&cq->ep_object); > - > - if (ret) { > - ep_del_work_pending(&cq->ep_object); > - hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); > - goto err_ucpy_out; > - } > - } > - > - hvnd_acquire_uctx_ref(uctx); > - > - return &cq->ibcq; > - > -err_ucpy_out: > - hvnd_destroy_cq(nd_dev, cq); > - > -err_cq: > - hvnd_db_unmap_user(uctx, ucmd.db_addr); > - > -err_db_map: > - ib_umem_release(cq->umem); > - > -err_ucpy: > - kfree(cq); > -err_done: > - return ERR_PTR(ret); > -} > - > -static struct ib_qp *hvnd_get_qp(struct ib_device *dev, int qpn) > -{ > - struct hvnd_dev *nd_dev; > - struct hvnd_qp *qp = NULL; > - > - nd_dev = to_nd_dev(dev); > - qp = get_qpp(nd_dev, qpn); > - return qp ? &qp->ibqp : NULL; > -} > - > -static int hvnd_ib_destroy_cq(struct ib_cq *ib_cq) > -{ > - struct hvnd_ucontext *uctx; > - struct hvnd_dev *nd_dev; > - struct hvnd_cq *cq; > - > - cq = to_nd_cq(ib_cq); > - uctx = cq->uctx; > - nd_dev = to_nd_dev(uctx->ibucontext.device); > - > - cq->monitor = false; > - > -// hvnd_cancel_io(&cq->ep_object); > - ep_stop(&cq->ep_object); > - > - hvnd_deinit_ep(&cq->ep_object); > - > - hvnd_db_unmap_user(uctx, (u64)cq->db_addr); > - ib_umem_release(cq->umem); > - > - hvnd_destroy_cq(nd_dev, cq); > - > - hvnd_drop_uctx_ref(nd_dev, uctx); > - kfree(cq); > - > - return 0; > -} > - > -static int hvnd_resize_cq(struct ib_cq *cq, int cqe, struct ib_udata *udata) > -{ > - /* > - * NDDirect does not support resizing CQ. 
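There is an error-path bug in hvnd_ib_create_cq() above that is worth a follow-up in the renamed copy: on ib_umem_get() failure the code does

	ret = IS_ERR(cq->umem);

IS_ERR() returns true/false, so ret becomes 1 and err_done hands ERR_PTR(1) back to the core, which the caller's IS_ERR() will not even recognize as an error pointer. It should be:

	cq->umem = ib_umem_get(ib_context, ucmd.buf_addr,
			       (entries * uctx->cqe_size),
			       IB_ACCESS_LOCAL_WRITE, 1);
	if (IS_ERR(cq->umem)) {
		ret = PTR_ERR(cq->umem);	/* negative errno */
		hvnd_error("ib_umem_get failed ret=%d\n", ret);
		goto err_ucpy;
	}

hvnd_reg_user_mr() below gets this right with PTR_ERR(), so it is just this call site.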
> - */ > - hvnd_info("check code\n"); > - return -ENOSYS; > -} > - > -static int hvnd_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc) > -{ > - hvnd_info("check code\n"); > - return 0; > -} > - > -static struct ib_mr *hvnd_get_dma_mr(struct ib_pd *pd, int acc) > -{ > - hvnd_info("check code\n"); > - return NULL; > -} > - > -static void debug_dump_umem(struct ib_umem *umem) > -{ > -#ifdef HVND_MEM_DEBUG > - struct ib_umem_chunk *chunk; > - struct scatterlist *sg; > - int len, j, entry; > - int shift = ffs(umem->page_size) - 1; > - > - hvnd_debug("umem=%p\n", umem); > - hvnd_debug("context=%p length=%lu offset=%d page_size=%d writable=%d hugetlb=%d\n", > - umem->context, > - umem->length, > - umem->offset, > - umem->page_size, > - umem->writable, > - umem->hugetlb); > - > - list_for_each_entry(chunk, &umem->chunk_list, list) { > - hvnd_debug("chunk->nmap=%d\n", chunk->nmap); > - for (j = 0; j < chunk->nmap; ++j) { > - sg = &chunk->page_list[j]; > - hvnd_debug("sg_dma_len=%d sg_dma_address=%llx\n", > - sg_dma_len(sg), sg_dma_address(sg)); > - hvnd_debug("page_link=%lx offset=%u length=%u\n", > - sg->page_link, sg->offset, sg->length); > - len = sg_dma_len(&chunk->page_list[j]) >> shift; > - for_each_sg(&chunk->page_list[j], sg, len, entry) { > - hvnd_debug("PFN=%lu\n", > - page_to_pfn(sg_page(sg))); > - } > - } > - } > -#endif > -} > - > - > -static struct ib_mr *hvnd_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, > - u64 virt, int acc, struct ib_udata *udata) > -{ > - int err = 0; > - struct hvnd_ib_pd *hvndpd = to_nd_pd(pd); > - struct hvnd_mr *mr; > - > - mr = kmalloc(sizeof(*mr), GFP_KERNEL); > - if (!mr) > - return ERR_PTR(-ENOMEM); > - mr->pd = hvndpd; > - > - mr->umem = ib_umem_get(pd->uobject->context, start, length, acc, 0); > - if (IS_ERR(mr->umem)) { > - err = PTR_ERR(mr->umem); > - hvnd_error("ib_umem_get failed ret=%d\n", err); > - kfree(mr); > - return ERR_PTR(err); > - } > - > - debug_dump_umem(mr->umem); > - > - mr->start = start; > - mr->length = length; > - mr->virt = virt; > - mr->acc = acc; > - > - hvnd_debug("start=%llx length=%llx virt=%llx acc=%d\n", > - start, length, virt, acc); > - > - /* > - * First create a memory region. > - */ > - err = hvnd_cr_mr(to_nd_dev(pd->device), > - to_nd_context(pd->uobject->context), hvndpd->handle, > - &mr->mr_handle); > - if (err) { > - hvnd_error("cr_mr failed; ret is %d\n", err); > - goto err; > - } > - > - err = hvnd_mr_register(to_nd_dev(pd->device), > - to_nd_context(pd->uobject->context), mr); > - > - if (err) > - goto err0; > - > - hvnd_acquire_uctx_ref(to_nd_context(pd->uobject->context)); > - > - return &mr->ibmr; > - > -err0: > - hvnd_free_mr(to_nd_dev(pd->device), > - to_nd_context(pd->uobject->context), mr->mr_handle); > -err: > - ib_umem_release(mr->umem); > - kfree(mr); > - return ERR_PTR(err); > -} > - > - > - > -static int hvnd_dereg_mr(struct ib_mr *ib_mr) > -{ > - int ret; > - struct hvnd_mr *mr = to_nd_mr(ib_mr); > - struct hvnd_ucontext *uctx = to_nd_context(ib_mr->pd->uobject->context); > - struct hvnd_dev *nd_dev = to_nd_dev(ib_mr->device); > - > - > - hvnd_debug("dereg_mr entering\n"); > - > - ret = hvnd_deregister_mr(nd_dev, uctx, mr->mr_handle); > - > - if (ret) { > - hvnd_error("hvnd_deregister_mr() failed: %x\n", ret); > - return ret; > - } > - /* > - * Now free up the memory region. 
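debug_dump_umem() above walks umem->chunk_list, which if I remember right was removed when ib_umem moved to a single sg_table (around v3.15), so the HVND_MEM_DEBUG variant of this file would no longer compile; it is latent only because the macro is presumably never defined. The rest of the driver already uses the sg_table form (see user_va_init_pfn() in vmbus_rdma.c, later in this patch); a sketch of the dump rewritten the same way:

	#ifdef HVND_MEM_DEBUG
		struct scatterlist *sg;
		int entry;

		hvnd_debug("umem=%p length=%lu\n", umem, umem->length);
		/* one entry per mapped page, same walk as user_va_init_pfn() */
		for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry)
			hvnd_debug("PFN=%lu len=%u\n",
				   page_to_pfn(sg_page(sg)), sg->length);
	#endif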
> - */ > - > - ret = hvnd_free_mr(nd_dev, uctx, mr->mr_handle); > - if (ret) { > - hvnd_error("hvnd_free_mr() failed: %x\n", ret); > - return ret; > - } > - > - ib_umem_release(mr->umem); > - > - hvnd_drop_uctx_ref(nd_dev, uctx); > - kfree(mr); > - > - hvnd_debug("dereg_mr done\n"); > - return 0; > -} > - > -static struct ib_mw *hvnd_alloc_mw(struct ib_pd *pd, enum ib_mw_type type, > - struct ib_udata *udata) > -{ > - hvnd_info("check code\n"); > - return NULL; > -} > - > -static int hvnd_dealloc_mw(struct ib_mw *mw) > -{ > - debug_check(__func__, __LINE__); > - return 0; > -} > - > -static int hvnd_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags) > -{ > - struct hvnd_ucontext *uctx; > - struct hvnd_dev *nd_dev; > - struct hvnd_cq *cq; > - > - cq = to_nd_cq(ibcq); > - uctx = cq->uctx; > - nd_dev = to_nd_dev(uctx->ibucontext.device); > - > - > - debug_check(__func__, __LINE__); > - > - return 0; > -} > - > -static int hvnd_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, > - struct ib_send_wr **bad_wr) > -{ > - debug_check(__func__, __LINE__); > - return 0; > -} > - > -int hvnd_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr, > - struct ib_recv_wr **bad_wr) > -{ > - debug_check(__func__, __LINE__); > - return 0; > -} > - > -static int hvnd_resolve_addr(struct sockaddr_in *laddr, > - struct sockaddr_in *raddr, > - struct if_physical_addr *phys_addrstruct) > -{ > - int ret; > - > - phys_addrstruct->length = ETH_ALEN; > - ret = hvnd_get_neigh_mac_addr((struct sockaddr *)laddr, > - (struct sockaddr *)raddr, > - phys_addrstruct->addr); > - > - hvnd_debug("Dest MAC is %pM\n", phys_addrstruct->addr); > - return ret; > -} > - > -static int hvnd_connect(struct iw_cm_id *cm_id, > - struct iw_cm_conn_param *conn_param) > -{ > - int ret = 0; > - struct hvnd_dev *nd_dev; > - struct hvnd_ep_obj *ep_object; > - struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_id->remote_addr; > - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; > - struct hvnd_qp *qp; > - struct if_physical_addr phys_addrstruct; > - union nd_sockaddr_inet dest_addr; > - u64 connector_handle; > - union nd_sockaddr_inet addr; > - char addr_buf[50]; > - > - if (cm_id->remote_addr.ss_family != AF_INET) { > - hvnd_error("cm_id->remote_addr.ss_family=%d not AF_INET\n", > - cm_id->remote_addr.ss_family); > - return -ENOSYS; > - } > - > - qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); > - > - if (!qp) { > - hvnd_error("failed to find qp conn_param->qpn=%d\n", > - conn_param->qpn); > - return -EINVAL; > - } > - > - cm_id->provider_data = qp; > - cm_id->add_ref(cm_id); > - qp->cm_id = cm_id; > - > - /* > - * Set the read/write limits. > - * Can we change the limits on a created QP? Luke? > - */ > - nd_dev = to_nd_dev(cm_id->device); > - ep_object = hvnd_setup_ep(cm_id, ND_CONNECTOR, nd_dev, qp->uctx); > - hvnd_debug("active connection: local irp is %d\n", > - ep_object->local_irp); > - if (!ep_object) { > - hvnd_error("hvnd_setup_ep failure\n"); > - ret = -ENOMEM; > - goto err_limit; > - } > - > - ret = hvnd_cr_connector(nd_dev, qp->uctx, > - &connector_handle); > - > - if (ret) { > - hvnd_error("hvnd_cr_connector failure ret=%d\n", ret); > - goto err_cr_connector; > - } > - > - hvnd_acquire_uctx_ref(qp->uctx); > - ep_object->ep_handle = connector_handle; > - ep_object->incoming = false; > - qp->connector = ep_object; > - > - /* > - * Bind the local address to the connector. 
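Two small problems earlier in hvnd_connect(): ep_object is dereferenced for the local_irp debug print one line before the !ep_object check, and the later ep_add_work_pending() failure jumps to the unwind path with ret still 0 (hvnd_resolve_addr() just succeeded), so the caller is told the connect worked while everything has been torn down. Something like:

	ep_object = hvnd_setup_ep(cm_id, ND_CONNECTOR, nd_dev, qp->uctx);
	if (!ep_object) {
		hvnd_error("hvnd_setup_ep failure\n");
		ret = -ENOMEM;
		goto err_limit;
	}
	hvnd_debug("active connection: local irp is %d\n",
		   ep_object->local_irp);
	...
	if (!ep_add_work_pending(ep_object)) {
		ret = -EINVAL;	/* my choice of errno; anything negative works */
		goto err_bind_connector;
	}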
> - */ > - hvnd_debug("Connect local address is %s\n", > - debug_inet_ntoa(laddr->sin_addr, addr_buf)); > - > - memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); > - hvnd_debug("CONNECT AF %d port %d addr %s\n", > - addr.ipv4.sin_family, addr.ipv4.sin_port, > - debug_inet_ntoa(addr.ipv4.sin_addr, addr_buf)); > - > - ret = hvnd_bind_connector(nd_dev, qp->uctx, > - connector_handle, > - &addr); > - > - if (ret) { > - hvnd_error("hvnd_bind_connector failed ret=%d\n", ret); > - goto err_bind_connector; > - } > - > - ret = hvnd_resolve_addr(laddr, raddr, &phys_addrstruct); > - if (ret) { > - hvnd_error("hvnd_resolve_addr failed ret=%d\n", ret); > - goto err_bind_connector; > - } > - > - memcpy(&dest_addr.ipv4, raddr, sizeof(struct sockaddr_in)); > - > - > - > - /* > - * Now attempt to connect. > - */ > - > - hvnd_debug("About to initiate connection\n"); > - > - if (!ep_add_work_pending(ep_object)) > - goto err_bind_connector; > - > - ep_object->cm_state = hvnd_cm_connect_received; > - ret = hvnd_connector_connect(nd_dev, qp->uctx, > - ep_object->ep_handle, > - conn_param->ird, conn_param->ord, > - conn_param->private_data_len, > - (u8 *)conn_param->private_data, > - qp->qp_handle, > - &phys_addrstruct, &dest_addr, > - ep_object); > - > - if (ret == 0) > - return 0; > - > - ep_object->cm_state = hvnd_cm_idle; > - ep_del_work_pending(ep_object); > - hvnd_error("hvnd_connector_connect failed ret=%d\n", ret); > - > -err_bind_connector: > - qp->connector = NULL; > - hvnd_free_connector(nd_dev, qp->uctx, > - connector_handle); > - hvnd_drop_uctx_ref(nd_dev, qp->uctx); > - > -err_cr_connector: > - kfree(ep_object); > - > -err_limit: > - cm_id->provider_data = NULL; > - qp->cm_id = NULL; > - cm_id->rem_ref(cm_id); > - return ret; > -} > - > -static int hvnd_accept_cr(struct iw_cm_id *cm_id, > - struct iw_cm_conn_param *conn_param) > -{ > - int ret = 0; > - struct hvnd_dev *nd_dev; > - struct hvnd_qp *qp; > - struct hvnd_ep_obj *connector; > - enum ibv_qp_state new_qp_state; > - > - hvnd_debug("Accepting connection - PASSIVE\n"); > - nd_dev = to_nd_dev(cm_id->device); > - qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); > - > - if (!qp) { > - hvnd_error("get_qpp failed conn_param->qpn=%d\n", > - conn_param->qpn); > - return -EINVAL; > - } > - > - > - connector = (struct hvnd_ep_obj *)cm_id->provider_data; > - qp->connector = connector; > - > - if (connector == NULL) { > - hvnd_error("NULL connector!\n"); > - return -EINVAL; > - } > - hvnd_debug("connector's cm_id is %p caller cm_id=%p\n", > - connector->cm_id, cm_id); > - > - connector->cq = qp->recv_cq; > - > - > - /* > - * Setup state for the accepted connection. 
> - */ > - cm_id->add_ref(cm_id); > - connector->cm_id = cm_id; > - if (conn_param == NULL) { > - hvnd_error("NULL conn_param!\n"); > - return -EINVAL; > - } > - > - connector->ord = conn_param->ord; > - connector->ird = conn_param->ird; > - > - if (!ep_add_work_pending(connector)) > - goto error; > - > - init_completion(&connector->connector_accept_event); > - > - ret = hvnd_connector_accept(nd_dev, qp->uctx, connector->ep_handle, > - qp->qp_handle, conn_param->ird, > - conn_param->ord, > - conn_param->private_data_len, > - conn_param->private_data, > - &new_qp_state, connector); > - > - if (ret) { > - ep_del_work_pending(connector); > - hvnd_error("connector accept failed\n"); > - goto error; > - } > - > - wait_for_completion(&connector->connector_accept_event); > - ret = connector->connector_accept_status; > - > - if (ret) { > - hvnd_error("connector_accept failed status=%x\n", ret); > - ret = -EIO; > - goto error; > - } > - > - hvnd_debug("Passive Connection Accepted; new qp state is %d\n", > - new_qp_state); > - connector->cm_state = hvnd_cm_accept_sent; > - return 0; > - > -error: > - ep_stop(connector); > - connector->cm_id = NULL; > - connector->cm_state = hvnd_cm_idle; > - > - qp->connector = NULL; > - cm_id->rem_ref(cm_id); > - > - return ret; > -} > - > -static int hvnd_reject_cr(struct iw_cm_id *cm_id, const void *pdata, > - u8 pdata_len) > -{ > - debug_check(__func__, __LINE__); > - return 0; > -} > - > -void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status) > -{ > - struct iw_cm_event cm_event; > - > - switch (status) { > - case STATUS_SUCCESS: > - case STATUS_CANCELLED: > - break; > - > - default: > - hvnd_warn("disconnect complete failed: status:%d\n", status); > - } > - > - > - hvnd_debug("active disconnect processed\n"); > - memset(&cm_event, 0, sizeof(cm_event)); > - > - complete(&ep_object->disconnect_event); > -} > - > - > -void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status) > -{ > - struct iw_cm_event cm_event; > - > - // make sure we only disconnect once > - if (atomic_xchg(&ep_object->disconnect_notified, 1)) > - return; > - > - /* > - * Turn off CQ monitoring. > - */ > - if (ep_object->cq) > - ep_object->cq->monitor = false; > - > - switch (ep_object->cm_state) { > - case hvnd_cm_connect_reply_sent: > - case hvnd_cm_established_sent: > - case hvnd_cm_accept_sent: > - break; > - > - default: > - hvnd_error("unexpected cm_state=%d\n", ep_object->cm_state); > - return; > - } > - > - switch (status) { > - case STATUS_SUCCESS: > - case STATUS_CANCELLED: > - case STATUS_DISCONNECTED: > - break; > - > - default: > - hvnd_warn("notify disconnect complete failed: status:%d\n", > - status); > - } > - > - hvnd_debug("passive disconnect notified\n"); > - memset(&cm_event, 0, sizeof(cm_event)); > - > - /* > - * Other end disconnected. > - * Connection has been disconnected; > - * notify the cm layer. 
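In hvnd_accept_cr() above, the conn_param NULL check sits after conn_param->qpn has already been dereferenced at the top of the function (and after cm_id->add_ref()), so it can never fire usefully; if it is worth keeping, it has to come first:

	if (conn_param == NULL) {
		hvnd_error("NULL conn_param!\n");
		return -EINVAL;
	}
	qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn);

The connector NULL check has the same shape: qp->connector is assigned from it before the check runs.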
> - */ > - cm_event.status = -ECONNRESET; > - cm_event.event = IW_CM_EVENT_CLOSE; > - > - if ((ep_object->cm_id) && > - (ep_object->cm_id->event_handler)) { > - > - ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); > - > - ep_object->cm_id->rem_ref(ep_object->cm_id); > - ep_object->cm_state = hvnd_cm_close_sent; > - } > -} > - > -void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status) > -{ > - struct iw_cm_event cm_event; > - int ret; > - > - /* > - * This is the problem area the return status may be: > - * 1: 0xc00000b5 (3221225653) - {Device Timeout} The specified I/O > - * operation on %hs was not completed before the time-out period > - * expired > - * 2: NTSTATUS 0xc0000241 (3221226049) - The transport connection was > - * aborted by the local system. > - * If we do nothing here, iwcm will wait for IW_CM_EVENT_ESTABLISHED > - * forever, and unable to clean shutdown need to fail the call earlier > - * on accept. > - */ > - > - ep_object->connector_accept_status = status; > - > - if (status) { > - hvnd_error("Connector accept failed; status is %x\n", status); > - complete(&ep_object->connector_accept_event); > - return; > - } > - > - memset(&cm_event, 0, sizeof(cm_event)); > - cm_event.event = IW_CM_EVENT_ESTABLISHED; > - cm_event.ird = ep_object->ird; > - cm_event.ord = ep_object->ord; > - cm_event.provider_data = (void *)ep_object; > - > - /* > - * We have successfully passively accepted the > - * incoming connection. > - */ > - > - hvnd_debug("Passive connection accepted!!\n"); > - if ((ep_object->cm_id) && > - (ep_object->cm_id->event_handler)) { > - ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); > - ep_object->cm_state = hvnd_cm_established_sent; > - } > - > - complete(&ep_object->connector_accept_event); > - > - /* > - * Request notification if the other end > - * were to disconnect. 
> - */ > - if (!ep_add_work_pending(ep_object)) > - return; > - > - ret = hvnd_connector_notify_disconnect(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - ep_object); > - > - if (ret) { > - ep_del_work_pending(ep_object); > - hvnd_error("Connector notify disconnect failed; ret: %d\n", > - ret); > - } > -} > - > - > -void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep_object, > - int status) > -{ > - > - struct ib_cq *ibcq; > - struct hvnd_cq *cq; > - > - cq = ep_object->cq; > - ibcq = &ep_object->cq->ibcq; > - > - if (!cq->monitor) > - return; > - > - // call the previous CQ complete > - if ((status == STATUS_PENDING) && cq->upcall_pending && > - ibcq->comp_handler) { > - ibcq->comp_handler(ibcq, ibcq->cq_context); > - cq->upcall_pending = false; > - hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); > - } > - > - if ((status != STATUS_PENDING) && ibcq->comp_handler && > - ibcq->cq_context) { > - ibcq->comp_handler(ibcq, ibcq->cq_context); > - hvnd_error("CQ comp_handler called status=%x\n", status); > - } > -} > - > -void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep_object, > - int status) > -{ > - struct ib_cq *ibcq; > - struct hvnd_cq *cq; > - int ret; > - > - cq = ep_object->cq; > - ibcq = &ep_object->cq->ibcq; > - > - // call hte previous CQ complete > - if (cq->upcall_pending && ibcq->comp_handler) { > - ibcq->comp_handler(ibcq, ibcq->cq_context); > - cq->upcall_pending = false; > - hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); > - } > - > - cq->upcall_pending = true; > - if (!ep_add_work_pending(ep_object)) > - return; > - > - ret = hvnd_notify_cq(ep_object->nd_dev, > - ep_object->cq, > - ND_CQ_NOTIFY_ANY, > - (u64)ep_object); > - > - if (ret) { > - ep_del_work_pending(ep_object); > -// hvnd_manage_io_state(ep_object, true); > - hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); > - } > - > - if ((status != 0) && (status != STATUS_CANCELLED)) { > - if (ibcq->event_handler) { > - struct ib_event event; > - > - event.device = ibcq->device; > - event.event = IB_EVENT_CQ_ERR; > - event.element.cq = ibcq; > - ibcq->event_handler(&event, ibcq->cq_context); > - > - hvnd_warn("CQ event_handler called status=%x\n", > - status); > - } > - } > -} > - > -int init_cm_event(struct hvnd_ep_obj *ep_object, struct iw_cm_event *cm_event, > - int event) > -{ > - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event->local_addr; > - struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event->remote_addr; > - struct nd_read_limits rd_limits; > - union nd_sockaddr_inet local_addr; > - union nd_sockaddr_inet remote_addr; > - int ret; > - > - /* > - * Now get the local address. > - */ > - ret = hvnd_connector_get_local_addr(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &local_addr); > - > - if (ret) { > - hvnd_error("Connector get addr failed; ret: %d\n", ret); > - return ret; > - } > - /* > - * Now get the remote address. > - */ > - ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &remote_addr); > - > - if (ret) { > - hvnd_error("Connector get peer addr failed; ret: %d\n", ret); > - return ret; > - } > - > - /* > - * Get other connection parameters. > - */ > - > - ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &rd_limits); > - > - if (ret) { > - hvnd_error("Connector rd limits failed; ret: %d\n", ret); > - return ret; > - } > - > - /* > - * XXXKYS: Luke: What about the length of the priv data? 
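On the CQ notify machinery above: hvnd_arm_cq() is a stub returning 0, so the driver never acts on ib_req_notify_cq() from the ULP. Instead the CQ is kept permanently armed: hvnd_ib_create_cq() posts ND_CQ_NOTIFY_ANY up front (unless disable_cq_notify), and hvnd_process_cq_event_complete() re-posts it after every upcall. Roughly, as I read it (not a documented contract):

	/*
	 * create_cq  -> hvnd_notify_cq(ND_CQ_NOTIFY_ANY)		initial arm
	 * host event -> hvnd_process_cq_event_complete()
	 *                 -> ibcq->comp_handler()			upcall
	 *                 -> hvnd_notify_cq(ND_CQ_NOTIFY_ANY)		re-arm
	 */

The upcall_pending flag then exists to fold the "previous" completion into the next event, which is why both handlers check it before calling comp_handler. (Also, "call hte previous" in the comment is a typo for "the".)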
> - */ > - ret = hvnd_connector_get_priv_data(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - ep_object->priv_data); > - > - if (ret) { > - hvnd_error("Connector get priv data failed; ret: %d\n", ret); > - return ret; > - } > - /* > - * Initialize CM structure. > - */ > - laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; > - hvnd_debug("Local addr is %d\n", laddr->sin_addr.s_addr); > - laddr->sin_port = local_addr.ipv4.sin_port; > - laddr->sin_family = AF_INET; > - > - raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; > - hvnd_debug("Remote addr is %d\n", raddr->sin_addr.s_addr); > - raddr->sin_port = remote_addr.ipv4.sin_port; > - raddr->sin_family = AF_INET; > - > - // KYS; LUke: is it always 148 bytes? > - cm_event->private_data_len = MAX_PRIVATE_DATA_LEN; > - cm_event->private_data = ep_object->priv_data; > - > - cm_event->ird = rd_limits.inbound; > - cm_event->ord = rd_limits.outbound; > - cm_event->event = event; > - > - ep_object->ird = cm_event->ird; > - ep_object->ord = cm_event->ord; > - > - return 0; > -} > - > -static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status) > -{ > - struct iw_cm_event cm_event; > - > - memset(&cm_event, 0, sizeof(cm_event)); > - > - cm_event.event = IW_CM_EVENT_CONNECT_REPLY; > - cm_event.status = status; > - > - hvnd_debug("returning status %d on connector %p\n", status, connector); > - > - if (connector->cm_id && connector->cm_id->event_handler) { > - connector->cm_id->event_handler(connector->cm_id, &cm_event); > - connector->cm_id->rem_ref(connector->cm_id); > - connector->cm_state = hvnd_cm_connect_reply_refused; > - } > -} > - > -void hvnd_process_events(struct work_struct *work) > -{ > - struct hvnd_work *wrk; > - struct nd_read_limits rd_limits; > - struct hvnd_ep_obj *ep_object; > - struct hvnd_ep_obj *parent; > - struct iw_cm_event cm_event; > - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event.local_addr; > - struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event.remote_addr; > - struct ndv_packet_hdr_control_1 *ctrl_hdr; > - union nd_sockaddr_inet local_addr; > - union nd_sockaddr_inet remote_addr; > - struct pkt_nd_get_connection_listener *connection_pkt; > - struct iw_cm_id *cm_id = NULL; > - int status; > - int ioctl; > - int ret; > - char priv_data[MAX_PRIVATE_DATA_LEN]; > - enum ibv_qp_state new_qp_state; > - struct incoming_pkt *incoming_pkt; > - unsigned long flags; > - > - memset(&cm_event, 0, sizeof(cm_event)); > - memset(&priv_data, 0, MAX_PRIVATE_DATA_LEN); > - > - > - wrk = container_of(work, struct hvnd_work, work); > - > - /* > - * Now call into the connection manager. 
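On the "is it always 148 bytes?" question in init_cm_event() above: because private_data_len is always reported as MAX_PRIVATE_DATA_LEN rather than the actual length, whatever was left in ep_object->priv_data from a previous exchange is handed to the ULP as padding. Until the real length is plumbed through, zeroing the buffer first would at least make the padding deterministic; my suggestion, not in the patch:

	memset(ep_object->priv_data, 0, MAX_PRIVATE_DATA_LEN);
	ret = hvnd_connector_get_priv_data(ep_object->nd_dev,
					   ep_object->uctx,
					   ep_object->ep_handle,
					   ep_object->priv_data);

The same pattern repeats in the listener path of hvnd_process_events() below.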
> - */ > - ep_object = (struct hvnd_ep_obj *)wrk->callback_arg; > - parent = ep_object->parent; > - > -process_next: > - incoming_pkt = NULL; > - spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); > - if (!list_empty(&ep_object->incoming_pkt_list)) { > - incoming_pkt = list_first_entry(&ep_object->incoming_pkt_list, > - struct incoming_pkt, > - list_entry); > - list_del(&incoming_pkt->list_entry); > - } > - spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, flags); > - if (incoming_pkt == NULL) > - return; > - > - ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt->pkt; > - status = ctrl_hdr->io_status; > - ioctl = ctrl_hdr->io_cntrl_code; > - > - hvnd_debug("Process Events IOCTL is: %s; iostatus failure: %x in work queue\n", > - hvnd_get_op_name(ioctl), status); > - > - if (status != 0) { > - bool log_error = true; > - > - if (ioctl == IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT && > - status == STATUS_DISCONNECTED) // expected > - log_error = false; > - > - if (log_error) > - hvnd_warn("Process Events IOCTL is: %s; iostatus failure: %x\n", > - hvnd_get_op_name(ioctl), status); > - } > - > - cm_event.status = status; > - > - switch (ep_object->type) { > - case ND_CONNECTOR: > - switch (ioctl) { > - case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: > - > - if (ep_object->parent != NULL) { > - > - // Do nothing with this connection request if > - // listener is stopping > - if (!ep_add_work_pending(ep_object->parent)) > - break; > - > - cm_id = ep_object->parent->cm_id; //Listener > - } > - > - connection_pkt = (struct pkt_nd_get_connection_listener *) ctrl_hdr; > - > - if ((status == 0) || (status == STATUS_CANCELLED)) { > - hvnd_get_incoming_connections(ep_object->parent, > - ep_object->parent->nd_dev, > - ep_object->uctx); > - } > - > - if (status) > - goto get_connection_request_done; > - > - /* > - * Now get the local address. > - */ > - ret = hvnd_connector_get_local_addr(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &local_addr); > - > - if (ret) { > - hvnd_error("Connector get addr failed; ret: %d\n", > - ret); > - goto get_connection_request_done; > - } > - /* > - * Now get the remote address. > - */ > - ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &remote_addr); > - > - if (ret) { > - hvnd_error("Connector get peer addr failed; ret: %d\n", > - ret); > - goto get_connection_request_done; > - } > - /* > - * Get other connection parameters. > - */ > - > - ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &rd_limits); > - > - if (ret) { > - hvnd_error("Connector rd imits failed; ret: %d\n", > - ret); > - goto get_connection_request_done; > - } > - > - /* > - * XXXKYS: Luke: What about the length of the priv data? 
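The process_next/goto drain loop at the top of hvnd_process_events() could be a plain while loop with a small pop helper, which also keeps the list locking in one place. Equivalent sketch (pop_incoming_pkt is my name, not in the driver):

	static struct incoming_pkt *pop_incoming_pkt(struct hvnd_ep_obj *ep)
	{
		struct incoming_pkt *pkt = NULL;
		unsigned long flags;

		spin_lock_irqsave(&ep->incoming_pkt_list_lock, flags);
		if (!list_empty(&ep->incoming_pkt_list)) {
			pkt = list_first_entry(&ep->incoming_pkt_list,
					       struct incoming_pkt, list_entry);
			list_del(&pkt->list_entry);
		}
		spin_unlock_irqrestore(&ep->incoming_pkt_list_lock, flags);
		return pkt;
	}

	while ((incoming_pkt = pop_incoming_pkt(ep_object)) != NULL) {
		/* ... existing switch ... */
		kfree(incoming_pkt);
		ep_del_work_pending(ep_object);
	}

Purely cosmetic, so fine to leave for the renamed copy.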
> - */ > - ret = hvnd_connector_get_priv_data( > - ep_object->nd_dev, ep_object->uctx, > - ep_object->ep_handle, ep_object->priv_data); > - > - if (ret) { > - hvnd_error("Connector get priv data failed; ret: %d\n", > - ret); > - goto get_connection_request_done; > - } > - > - cm_event.event = IW_CM_EVENT_CONNECT_REQUEST; > - cm_event.provider_data = (void *)ep_object; > - > - laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; > - hvnd_debug("Local addr is %d\n", > - laddr->sin_addr.s_addr); > - laddr->sin_port = local_addr.ipv4.sin_port; > - laddr->sin_family = AF_INET; > - > - raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; > - hvnd_debug("Remote addr is %d\n", > - raddr->sin_addr.s_addr); > - raddr->sin_port = remote_addr.ipv4.sin_port; > - raddr->sin_family = AF_INET; > - > - // KYS; LUke: is it always 148 bytes? > - cm_event.private_data_len = MAX_PRIVATE_DATA_LEN; > - cm_event.private_data = ep_object->priv_data; > - > - cm_event.ird = rd_limits.inbound; > - cm_event.ord = rd_limits.outbound; > - > - ep_object->ird = cm_event.ird; > - ep_object->ord = cm_event.ord; > - > - > - if ((cm_id != NULL) && cm_id->event_handler) { > - cm_id->event_handler(cm_id, &cm_event); > - ep_object->cm_state = hvnd_cm_connect_request_sent; > - } > - > -get_connection_request_done: > - if (ep_object->parent != NULL) > - ep_del_work_pending(ep_object->parent); > - break; > - > - case IOCTL_ND_CONNECTOR_CONNECT: > - > - cm_event.event = IW_CM_EVENT_CONNECT_REPLY; > - if (status == STATUS_TIMEOUT && > - ep_object->connector_connect_retry < 3) { > - //TIMEOUT retry > - > - if (!ep_add_work_pending(ep_object)) > - goto refuse_connection; > - > - hvnd_warn("Connector connect timed out, reconnecting... retry count: %d\n", > - ep_object->connector_connect_retry); > - ep_object->connector_connect_retry++; > - ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, > - &ep_object->connector_connect_pkt.hdr, > - sizeof(ep_object->connector_connect_pkt), > - (u64)&ep_object->connector_connect_pkt); > - > - if (ret) { > - hvnd_error("Connector on time out failed: %d\n", > - ret); > - ep_del_work_pending(ep_object); > - goto refuse_connection; > - } > - break; > - } > - > -refuse_connection: > - if (status) { > - cm_event.status = -ECONNREFUSED; > - if (status == STATUS_TIMEOUT) > - cm_event.status = -ETIMEDOUT; > - > - hvnd_refuse_connection(ep_object, > - cm_event.status); > - break; > - } > - > - hvnd_debug("ACTIVE Connection ACCEPTED\n"); > - ret = init_cm_event(ep_object, &cm_event, > - IW_CM_EVENT_CONNECT_REPLY); > - if (ret) { > - hvnd_error("init_cm_event failed ret=%d\n", > - ret); > - goto process_done; > - } > - > - ret = hvnd_connector_complete_connect(ep_object->nd_dev, > - ep_object->uctx, > - ep_object->ep_handle, > - &new_qp_state); > - if (ret) { > - hvnd_error("connector_complete failed\n"); > - goto process_done; > - } > - > - cm_event.provider_data = (void *)ep_object; > - > - if ((ep_object->cm_id) && > - (ep_object->cm_id->event_handler)) { > - ep_object->cm_id->event_handler( > - ep_object->cm_id, &cm_event); > - ep_object->cm_state = hvnd_cm_connect_reply_sent; > - } > - /* > - * Rquest notification if the other end > - * were to disconnect. 
> - */ > - if (!ep_add_work_pending(ep_object)) > - goto process_done; > - > - ret = hvnd_connector_notify_disconnect( > - ep_object->nd_dev, ep_object->uctx, > - ep_object->ep_handle, ep_object); > - > - if (ret) { > - ep_del_work_pending(ep_object); > - hvnd_error("Connector notify disconnect failed; ret: %d\n", > - ret); > - } > - > - break; > - > - case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: > - hvnd_process_notify_disconnect(ep_object, status); > - break; > - > - > - default: > - hvnd_error("Unknown Connector IOCTL\n"); > - break; > - } > - break; > - default: > - hvnd_error("Unknown endpoint object\n"); > - break; > - } > -process_done: > - kfree(incoming_pkt); > - ep_del_work_pending(ep_object); > - > - goto process_next; > -} > - > - > -static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, > - struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx) > -{ > - struct hvnd_ep_obj *ep_object; > - int ret; > - > - ep_object = kzalloc(sizeof(struct hvnd_ep_obj), GFP_KERNEL); > - > - if (!ep_object) > - return NULL; > - > - ret = hvnd_init_ep(ep_object, cm_id, ep_type, nd_dev, uctx); > - > - if (ret) { > - hvnd_error("hvnd_init_ep failed ret=%d\n", ret); > - kfree(ep_object); > - return NULL; > - } > - > - return ep_object; > -} > - > -/* > - * return value: > - * true: I/O state is stopped, we should not do upcall > - * flase: I/O state is running and normal > - */ > -#if 0 > -static bool hvnd_manage_io_state(struct hvnd_ep_obj *ep, bool failure) > -{ > - unsigned long flags; > - > - spin_lock_irqsave(&ep->ep_lk, flags); > - if (ep->to_be_destroyed) { > - hvnd_warn("ep being destroyed\n"); > - if (ep->io_outstanding) { > - hvnd_warn("ep being destroyed i/O pending waking up on %p\n", > - &ep->block_event); > - complete(&ep->block_event); > - ep->io_outstanding = false; > - } > - spin_unlock_irqrestore(&ep->ep_lk, flags); > - return true; > - } > - if (!failure) > - ep->io_outstanding = true; > - spin_unlock_irqrestore(&ep->ep_lk, flags); > - return false; > -} > -#endif > - > -static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, > - struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx) > -{ > - struct hvnd_ep_obj *connector; > - u64 connector_handle; > - int ret; > - > - /* > - * First handle the protocol for > - * destruction - outstanding I/O. > - */ > - > -// if (hvnd_manage_io_state(listener, false)) > -// return 0; > - /* > - * Create a connector. > - */ > - connector = hvnd_setup_ep(listener->cm_id, ND_CONNECTOR, nd_dev, uctx); > - if (!connector) { > - hvnd_error("hvnd_setup_ep failed\n"); > - ret = -ENOMEM; > - goto con_alloc_err; > - } > - > - ret = hvnd_cr_connector(nd_dev, uctx, > - &connector_handle); > - if (ret) { > - hvnd_error("hvnd_cr_connector failed ret=%d\n", ret); > - goto con_cr_err; > - } > - > - /* > - * Now get a connection if one is pending. 
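The "register for disconnect notification" block just above is the second verbatim copy; the first is at the end of hvnd_process_connector_accept(). Both log but otherwise ignore the failure, and the comment above this copy misspells "Request" as "Rquest". A shared helper would keep the two in sync; a sketch, with a name I made up:

	static void hvnd_post_disconnect_notify(struct hvnd_ep_obj *ep)
	{
		int ret;

		if (!ep_add_work_pending(ep))
			return;

		ret = hvnd_connector_notify_disconnect(ep->nd_dev, ep->uctx,
						       ep->ep_handle, ep);
		if (ret) {
			ep_del_work_pending(ep);
			hvnd_error("Connector notify disconnect failed; ret: %d\n",
				   ret);
		}
	}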
> - */ > - connector->ep_handle = connector_handle; > - connector->parent = listener; > - > - if (!ep_add_work_pending(connector)) > - goto get_connection_err; > - > - ret = hvnd_get_connection_listener(nd_dev, uctx, > - listener->ep_handle, > - connector_handle, > - (u64)connector); > - > - if (ret) { > - hvnd_debug("listener_get_connection failed\n"); > - ep_del_work_pending(connector); > - goto get_connection_err; > - } > - > - hvnd_acquire_uctx_ref(uctx); > - listener->outstanding_handle = connector_handle; > - listener->outstanding_ep = connector; > - hvnd_debug("outstanding handle is %p\n", (void *)connector_handle); > - return 0; > - > -get_connection_err: > - hvnd_free_handle(nd_dev, uctx, > - connector_handle, > - IOCTL_ND_CONNECTOR_FREE); > - > -con_cr_err: > - kfree(connector); > -con_alloc_err: > -// hvnd_manage_io_state(listener, true); > - return ret; > -} > - > -static int hvnd_create_listen(struct iw_cm_id *cm_id, int backlog) > -{ > - int ret = 0; > - struct hvnd_dev *nd_dev; > - struct hvnd_ucontext *uctx; > - struct hvnd_ep_obj *ep_object; > - union nd_sockaddr_inet addr; > - union nd_sockaddr_inet local_addr; > - u64 listener_handle; > - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; > - union nd_sockaddr_inet og_addr; > - > - > - nd_dev = to_nd_dev(cm_id->device); > - uctx = get_uctx(nd_dev, current_pid()); > - hvnd_debug("uctx is %p; pid is %d\n", uctx, current_pid()); > - > - if (cm_id->local_addr.ss_family != AF_INET) { > - hvnd_error("cm_id->local_addr.ss_family =%d not AF_INET\n", > - cm_id->local_addr.ss_family); > - return -ENOSYS; > - } > - > - /* > - * If the local address is LOOPBACK or INADDR_ANY, get an an address > - * to bind the listener. For now, just get the first address > - * available. > - */ > - > - if (IN_LOOPBACK(ntohl(laddr->sin_addr.s_addr)) || > - (laddr->sin_addr.s_addr == INADDR_ANY)) { > - > - hvnd_debug("need to get an address\n"); > - ret = hvnd_get_outgoing_rdma_addr(nd_dev, uctx, &og_addr); > - > - if (ret) { > - hvnd_error("failed to get the og address\n"); > - return ret; > - } > - > - laddr->sin_addr.s_addr = og_addr.ipv4.sin_addr.s_addr; > - } > - > - cm_id->add_ref(cm_id); > - > - ep_object = hvnd_setup_ep(cm_id, ND_LISTENER, nd_dev, uctx); > - > - if (!ep_object) { > - hvnd_error("hvnd_setup_ep returned NULL\n"); > - goto alloc_err; > - } > - > - ret = hvnd_cr_listener(nd_dev, uctx, > - &listener_handle); > - if (ret) { > - hvnd_error("hvnd_cr_listener failed ret=%d\n", ret); > - goto cr_err; > - } > - > - ep_object->ep_handle = listener_handle; > - > - cm_id->provider_data = ep_object; > - > - /* > - * Now bind the listener. > - * IPV4 support only. > - */ > - memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); > - > - ret = hvnd_bind_listener(nd_dev, uctx, > - listener_handle, > - &addr); > - if (ret) { > - hvnd_error("hvnd_bind_listener failed ret=%d\n", ret); > - goto bind_err; > - } > - > - /* > - * Now get the local address. > - */ > - ret = hvnd_get_addr_listener(nd_dev, uctx, > - listener_handle, > - &local_addr); > - if (ret) { > - hvnd_error("hvnd_get_addr_listener failed ret=%d\n", ret); > - goto bind_err; > - } > - > - /* > - * Now put the listener in the listen mode. > - */ > - > - ret = hvnd_listen_listener(nd_dev, uctx, > - listener_handle, > - backlog); > - > - if (ret) { > - hvnd_error("hvnd_listen_listener failed ret=%d\n", ret); > - goto bind_err; > - } > - > - > - /* > - * Now get a pending connection if one is pending. 
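In hvnd_create_listen() above, the !ep_object branch jumps to alloc_err with ret still initialized to 0, so an allocation failure is reported to the iwcm core as a successful listen. It needs:

	ep_object = hvnd_setup_ep(cm_id, ND_LISTENER, nd_dev, uctx);
	if (!ep_object) {
		hvnd_error("hvnd_setup_ep returned NULL\n");
		ret = -ENOMEM;
		goto alloc_err;
	}

(hvnd_get_incoming_connections() handles the same failure correctly, so it is just this call site.)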
> - */ > - ret = hvnd_get_incoming_connections(ep_object, nd_dev, uctx); > - if (ret) { > - hvnd_error("hvnd_get_incoming_connections failed ret=%d\n", > - ret); > - goto bind_err; > - } > - > - hvnd_acquire_uctx_ref(uctx); > - hvnd_debug("cm_id=%p\n", cm_id); > - return 0; > - > -bind_err: > - hvnd_free_handle(nd_dev, uctx, > - listener_handle, > - IOCTL_ND_LISTENER_FREE); > -cr_err: > - kfree(ep_object); > -alloc_err: > - cm_id->provider_data = NULL; > - cm_id->rem_ref(cm_id); > - return ret; > -} > - > -static int hvnd_destroy_listen(struct iw_cm_id *cm_id) > -{ > - struct hvnd_dev *nd_dev; > - struct hvnd_ucontext *uctx; > - struct hvnd_ep_obj *ep_object; > - > - nd_dev = to_nd_dev(cm_id->device); > - > - ep_object = (struct hvnd_ep_obj *)cm_id->provider_data; > - > - hvnd_debug("uctx is %p\n", ep_object->uctx); > - hvnd_debug("Destroying Listener cm_id=%p\n", cm_id); > - uctx = ep_object->uctx; > - > - // make sure there is nothing in progress on this ep > - ep_stop(ep_object); > - > - hvnd_free_handle(nd_dev, uctx, > - ep_object->ep_handle, > - IOCTL_ND_LISTENER_FREE); > - > - /* > - * We may have an ouststanding connector for > - * incoming connection requests; clean it up. > - */ > - > - if (ep_object->outstanding_handle != 0) { > - > - // make sure there is nothing in progress on this ep > - ep_stop(ep_object->outstanding_ep); > - > - hvnd_free_handle(nd_dev, uctx, > - ep_object->outstanding_handle, > - IOCTL_ND_CONNECTOR_FREE); > - > - > - hvnd_drop_uctx_ref(nd_dev, uctx); > - hvnd_destroy_ep(ep_object->outstanding_ep); > - } > - > - /* > - * Now everything should have stopped > - */ > - > - cm_id->rem_ref(cm_id); > - hvnd_destroy_ep(ep_object); > - cm_id->provider_data = NULL; > - hvnd_drop_uctx_ref(nd_dev, uctx); > - > - hvnd_debug("cm_id=%p\n", cm_id); > - return 0; > -} > - > -static void hvnd_qp_add_ref(struct ib_qp *ibqp) > -{ > - struct hvnd_qp *qp = to_nd_qp(ibqp); > - > - atomic_inc(&qp->refcnt); > -} > - > -void hvnd_qp_rem_ref(struct ib_qp *ibqp) > -{ > - struct hvnd_qp *qp = to_nd_qp(ibqp); > - > - if (atomic_dec_and_test(&qp->refcnt)) > - wake_up(&qp->wait); > -} > - > -static DEVICE_ATTR(hw_rev, 0444, hvnd_show_rev, NULL); > -static DEVICE_ATTR(fw_ver, 0444, hvnd_show_fw_ver, NULL); > -static DEVICE_ATTR(hca_type, 0444, hvnd_show_hca, NULL); > -static DEVICE_ATTR(board_id, 0444, hvnd_show_board, NULL); > - > -static struct device_attribute *hvnd_class_attributes[] = { > - &dev_attr_hw_rev, > - &dev_attr_fw_ver, > - &dev_attr_hca_type, > - &dev_attr_board_id, > -}; > - > -int hvnd_register_device(struct hvnd_dev *dev, char *ip_addr, char *mac_addr) > -{ > - int ret; > - > - dev->ibdev.owner = THIS_MODULE; > - dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY | IB_DEVICE_MEM_WINDOW; > - dev->ibdev.local_dma_lkey = 0; > - dev->ibdev.uverbs_cmd_mask = > - (1ull << IB_USER_VERBS_CMD_GET_CONTEXT) | > - (1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) | > - (1ull << IB_USER_VERBS_CMD_QUERY_PORT) | > - (1ull << IB_USER_VERBS_CMD_ALLOC_PD) | > - (1ull << IB_USER_VERBS_CMD_DEALLOC_PD) | > - (1ull << IB_USER_VERBS_CMD_REG_MR) | > - (1ull << IB_USER_VERBS_CMD_DEREG_MR) | > - (1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL) | > - (1ull << IB_USER_VERBS_CMD_CREATE_CQ) | > - (1ull << IB_USER_VERBS_CMD_DESTROY_CQ) | > - (1ull << IB_USER_VERBS_CMD_REQ_NOTIFY_CQ) | > - (1ull << IB_USER_VERBS_CMD_CREATE_QP) | > - (1ull << IB_USER_VERBS_CMD_MODIFY_QP) | > - (1ull << IB_USER_VERBS_CMD_QUERY_QP) | > - (1ull << IB_USER_VERBS_CMD_POLL_CQ) | > - (1ull << IB_USER_VERBS_CMD_DESTROY_QP) | > - (1ull << 
IB_USER_VERBS_CMD_POST_SEND) | > - (1ull << IB_USER_VERBS_CMD_POST_RECV); > - dev->ibdev.node_type = RDMA_NODE_RNIC; > - memcpy(dev->ibdev.node_desc, HVND_NODE_DESC, sizeof(HVND_NODE_DESC)); > - memcpy(&dev->ibdev.node_guid, mac_addr, 6); > - dev->ibdev.phys_port_cnt = 1; //dev->nports; > - dev->ibdev.num_comp_vectors = 1; > - dev->ibdev.query_device = hvnd_query_device; > - dev->ibdev.query_port = hvnd_query_port; > - dev->ibdev.get_link_layer = hvnd_get_link_layer; > - dev->ibdev.query_pkey = hvnd_query_pkey; > - dev->ibdev.query_gid = hvnd_query_gid; > - dev->ibdev.alloc_ucontext = hvnd_alloc_ucontext; > - dev->ibdev.dealloc_ucontext = hvnd_dealloc_ucontext; > - dev->ibdev.mmap = hvnd_mmap; > - dev->ibdev.alloc_pd = hvnd_allocate_pd; > - dev->ibdev.dealloc_pd = hvnd_deallocate_pd; > - dev->ibdev.create_ah = hvnd_ah_create; > - dev->ibdev.destroy_ah = hvnd_ah_destroy; > - dev->ibdev.create_qp = hvnd_ib_create_qp; > - dev->ibdev.modify_qp = hvnd_ib_modify_qp; > - dev->ibdev.query_qp = hvnd_ib_query_qp; > - dev->ibdev.destroy_qp = hvnd_destroy_qp; > - dev->ibdev.create_cq = hvnd_ib_create_cq; > - dev->ibdev.destroy_cq = hvnd_ib_destroy_cq; > - dev->ibdev.resize_cq = hvnd_resize_cq; > - dev->ibdev.poll_cq = hvnd_poll_cq; > - dev->ibdev.get_dma_mr = hvnd_get_dma_mr; > - dev->ibdev.reg_user_mr = hvnd_reg_user_mr; > - dev->ibdev.dereg_mr = hvnd_dereg_mr; > - dev->ibdev.alloc_mw = hvnd_alloc_mw; > - dev->ibdev.dealloc_mw = hvnd_dealloc_mw; > - dev->ibdev.attach_mcast = hvnd_multicast_attach; > - dev->ibdev.detach_mcast = hvnd_multicast_detach; > - dev->ibdev.process_mad = hvnd_process_mad; > - dev->ibdev.req_notify_cq = hvnd_arm_cq; > - dev->ibdev.post_send = hvnd_post_send; > - dev->ibdev.post_recv = hvnd_post_receive; > - dev->ibdev.uverbs_abi_ver = MLX4_IB_UVERBS_ABI_VERSION; > - > - dev->ibdev.get_port_immutable = hvnd_get_port_immutable; > - > - //DMA ops for mapping all possible addresses > - dev->ibdev.dev.parent = &(dev->hvdev->device); > - dev->ibdev.dev.dma_ops = &vmbus_dma_ops; > - dev->ibdev.dev.dma_mask = (u64 *) DMA_BIT_MASK(64); > - dev->ibdev.dev.coherent_dma_mask = DMA_BIT_MASK(64); > - > - dev->ibdev.iwcm = kmalloc(sizeof(struct iw_cm_verbs), GFP_KERNEL); > - if (!dev->ibdev.iwcm) > - return -ENOMEM; > - > - dev->ibdev.iwcm->connect = hvnd_connect; > - dev->ibdev.iwcm->accept = hvnd_accept_cr; > - dev->ibdev.iwcm->reject = hvnd_reject_cr; > - dev->ibdev.iwcm->create_listen = hvnd_create_listen; > - dev->ibdev.iwcm->destroy_listen = hvnd_destroy_listen; > - dev->ibdev.iwcm->add_ref = hvnd_qp_add_ref; > - dev->ibdev.iwcm->rem_ref = hvnd_qp_rem_ref; > - dev->ibdev.iwcm->get_qp = hvnd_get_qp; > - > - strlcpy(dev->ibdev.name, "mlx4_%d", IB_DEVICE_NAME_MAX); > - ret = ib_register_device(&dev->ibdev, NULL); > - if (ret) { > - hvnd_error("ib_register_device failed ret=%d\n", ret); > - goto bail1; > - } > - > -#if 0 > - for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) { > - ret = device_create_file(&dev->ibdev.dev, > - hvnd_class_attributes[i]); > - if (ret) { > - hvnd_error("device_create_file failed ret=%d\n", ret); > - goto bail2; > - } > - } > -#endif > - > - dev->ib_active = true; > - return 0; > -//bail2: > -// ib_unregister_device(&dev->ibdev); > -bail1: > - kfree(dev->ibdev.iwcm); > - return ret; > -} > - > -void hvnd_unregister_device(struct hvnd_dev *dev) > -{ > - int i; > - > - for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) > - device_remove_file(&dev->ibdev.dev, > - hvnd_class_attributes[i]); > - ib_unregister_device(&dev->ibdev); > - kfree(dev->ibdev.iwcm); 
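One asymmetry above: hvnd_register_device() keeps the device_create_file() loop under #if 0, but hvnd_unregister_device() still calls device_remove_file() for all four attributes. Removing an attribute that was never created is tolerated by sysfs as far as I know, but the two sides should agree; either drop the removal loop or re-enable creation with the unwinding the #if 0 code already sketches:

	for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) {
		ret = device_create_file(&dev->ibdev.dev,
					 hvnd_class_attributes[i]);
		if (ret)
			goto bail2;	/* ib_unregister_device + kfree(iwcm) */
	}

The mlx4 masquerade itself (node name "mlx4_%d" plus MLX4_IB_UVERBS_ABI_VERSION) is what lets stock userspace libmlx4 drive this device, though it also means a real mlx4 HCA in the same VM would be indistinguishable by name.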
> - ib_dealloc_device((struct ib_device *)dev); > -} > - > -static int hvnd_try_bind_nic(unsigned char *mac, __be32 ip) > -{ > - int ret; > - struct hvnd_dev *nd_dev = g_nd_dev; > - > - mutex_lock(&nd_dev->bind_mutex); > - if (nd_dev->bind_complete) { > - mutex_unlock(&nd_dev->bind_mutex); > - return 1; > - } > - > - memcpy(nd_dev->mac_addr, mac, 6); > - *(__be32 *)(nd_dev->ip_addr) = ip; > - > - /* > - * Bind the NIC. > - */ > - hvnd_info("trying to bind to IP %pI4 MAC %pM\n", > - nd_dev->ip_addr, nd_dev->mac_addr); > - ret = hvnd_bind_nic(nd_dev, false, nd_dev->ip_addr, nd_dev->mac_addr); > - if (ret || nd_dev->bind_pkt.pkt_hdr.status) { > - mutex_unlock(&nd_dev->bind_mutex); > - return 1; > - } > - > - /* if we reach here, this means bind_nic is a success */ > - hvnd_error("successfully bound to IP %pI4 MAC %pM\n", > - nd_dev->ip_addr, nd_dev->mac_addr); > - complete(&nd_dev->addr_set); > - nd_dev->bind_complete = 1; > - mutex_unlock(&nd_dev->bind_mutex); > - > - ret = hvnd_register_device(nd_dev, nd_dev->ip_addr, nd_dev->mac_addr); > - > - if (!ret) > - return 0; > - > - hvnd_error("hvnd_register_device failed ret=%d\n", ret); > - > - /* roll back all allocated resources on error */ > - iounmap(nd_dev->mmio_virt); > - release_resource(&nd_dev->mmio_resource); > - > - vmbus_close(nd_dev->hvdev->channel); > - ib_dealloc_device((struct ib_device *)nd_dev); > - > - return 1; > -} > - > -static void hvnd_inetaddr_event_up(unsigned long event, struct in_ifaddr *ifa) > -{ > - hvnd_try_bind_nic(ifa->ifa_dev->dev->dev_addr, ifa->ifa_address); > -} > - > -static int hvnd_inetaddr_event(struct notifier_block *notifier, > - unsigned long event, void *ptr) > -{ > - struct in_ifaddr *ifa = ptr; > - > - switch (event) { > - case NETDEV_UP: > - hvnd_inetaddr_event_up(event, ifa); > - break; > - default: > - hvnd_debug("Received inetaddr event %lu\n", event); > - } > - > - return NOTIFY_DONE; > -} > - > -static struct notifier_block hvnd_inetaddr_notifier = { > - .notifier_call = hvnd_inetaddr_event, > -}; > - > -static int start_bind_nic(void) > -{ > - struct net_device *dev; > - struct in_device *idev; > - struct in_ifaddr *ifa; > - > - register_inetaddr_notifier(&hvnd_inetaddr_notifier); > - > - rtnl_lock(); > - for_each_netdev(&init_net, dev) { > - idev = in_dev_get(dev); > - if (!idev) > - continue; > - for (ifa = (idev)->ifa_list; > - ifa && !(ifa->ifa_flags&IFA_F_SECONDARY); > - ifa = ifa->ifa_next) { > - hvnd_try_bind_nic(dev->dev_addr, ifa->ifa_address); > - } > - } > - rtnl_unlock(); > - > - return 0; > -} > - > -static int hvnd_probe(struct hv_device *dev, > - const struct hv_vmbus_device_id *dev_id) > -{ > - struct hvnd_dev *nd_dev; > - int ret = 0; > - > - hvnd_debug("hvnd starting\n"); > - > - nd_dev = (struct hvnd_dev *)ib_alloc_device(sizeof(struct hvnd_dev)); > - if (!nd_dev) { > - ret = -ENOMEM; > - goto err_out0; > - } > - > - nd_dev->hvdev = dev; > - /* > - * We are going to masquerade as MLX4 device; > - * Set the vendor and device ID accordingly. > - */ > - dev->vendor_id = 0x15b3; //Mellanox > - dev->device_id = 0x1003; //Mellanox HCA > - INIT_LIST_HEAD(&nd_dev->listentry); > - spin_lock_init(&nd_dev->uctxt_lk); > - nd_dev->ib_active = false; > - > - /* > - * Initialize the state for the id table. 
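start_bind_nic() above takes a reference with in_dev_get() but never calls in_dev_put(), so every netdev walked leaks an in_device refcount. Under rtnl_lock the reference is not needed at all:

	rtnl_lock();
	for_each_netdev(&init_net, dev) {
		idev = __in_dev_get_rtnl(dev);	/* no refcount taken; rtnl held */
		if (!idev)
			continue;
		...
	}
	rtnl_unlock();

The ifa loop condition also stops at the first IFA_F_SECONDARY address rather than skipping it; that may be intentional (primaries come first in ifa_list), but it reads oddly.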
> - */ > - spin_lock_init(&nd_dev->id_lock); > - idr_init(&nd_dev->cqidr); > - idr_init(&nd_dev->qpidr); > - idr_init(&nd_dev->mmidr); > - idr_init(&nd_dev->irpidr); > - idr_init(&nd_dev->uctxidr); > - > - atomic_set(&nd_dev->open_cnt, 0); > - > - sema_init(&nd_dev->query_pkt_sem, 1); > - > - ret = vmbus_open(dev->channel, HVND_RING_SZ, HVND_RING_SZ, NULL, 0, > - hvnd_callback, dev); > - > - if (ret) { > - hvnd_error("vmbus_open failed ret=%d\n", ret); > - goto err_out1; > - } > - > - hv_set_drvdata(dev, nd_dev); > - > - ret = hvnd_negotiate_version(nd_dev); > - > - if (ret) { > - hvnd_error("hvnd_negotiate_version failed ret=%d\n", ret); > - goto err_out2; > - } > - > - /* > - * Register resources with the host. > - */ > - ret = hvnd_init_resources(nd_dev); > - if (ret) { > - hvnd_error("hvnd_init_resources failed ret=%d\n", ret); > - goto err_out2; > - } > - > - /* > - * Try to bind every NIC to ND channel, > - * ND host will only return success for the correct one > - */ > - nd_dev->bind_complete = 0; > - mutex_init(&nd_dev->bind_mutex); > - init_completion(&nd_dev->addr_set); > - > - g_nd_dev = nd_dev; > - start_bind_nic(); > - > - return 0; > - > -err_out2: > - vmbus_close(dev->channel); > - > -err_out1: > - ib_dealloc_device((struct ib_device *)nd_dev); > - > -err_out0: > - return ret; > -} > - > -static int hvnd_remove(struct hv_device *dev) > -{ > - struct hvnd_dev *nd_dev = hv_get_drvdata(dev); > - > - unregister_inetaddr_notifier(&hvnd_inetaddr_notifier); > - hvnd_bind_nic(nd_dev, true, nd_dev->ip_addr, nd_dev->mac_addr); > - hvnd_unregister_device(nd_dev); > - vmbus_close(dev->channel); > - iounmap(nd_dev->mmio_virt); > - release_resource(&nd_dev->mmio_resource); > - return 0; > -} > - > -static const struct hv_vmbus_device_id id_table[] = { > - /* VMBUS RDMA class guid */ > - /* 8c2eaf3d-32a7-4b09-ab99-bd1f1c86b501 */ > - { HV_ND_GUID, }, > - { }, > -}; > - > -MODULE_DEVICE_TABLE(vmbus, id_table); > - > -static struct hv_driver hvnd_drv = { > - .name = "hv_guest_rdma", > - .id_table = id_table, > - .probe = hvnd_probe, > - .remove = hvnd_remove, > -}; > - > - > -static int __init init_hvnd_drv(void) > -{ > - > - pr_info("Registered HyperV networkDirect Driver\n"); > - hvnd_addr_init(); > - return vmbus_driver_register(&hvnd_drv); > -} > - > -static void exit_hvnd_drv(void) > -{ > - pr_info("De-Registered HyperV networkDirect Driver\n"); > - hvnd_addr_deinit(); > - vmbus_driver_unregister(&hvnd_drv); > -} > - > - > -module_init(init_hvnd_drv); > -module_exit(exit_hvnd_drv); > - > -MODULE_DESCRIPTION("Hyper-V NetworkDirect Driver"); > -MODULE_LICENSE("GPL"); > -MODULE_VERSION("4.1.3"); > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/user.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/user.h > deleted file mode 100644 > index 07e6769ef43b..000000000000 > --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/user.h > +++ /dev/null > @@ -1,107 +0,0 @@ > -/* > - * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved. > - * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved. > - * > - * This software is available to you under a choice of one of two > - * licenses. 
You may choose to be licensed under the terms of the GNU > - * General Public License (GPL) Version 2, available from the file > - * COPYING in the main directory of this source tree, or the > - * OpenIB.org BSD license below: > - * > - * Redistribution and use in source and binary forms, with or > - * without modification, are permitted provided that the following > - * conditions are met: > - * > - * - Redistributions of source code must retain the above > - * copyright notice, this list of conditions and the following > - * disclaimer. > - * > - * - Redistributions in binary form must reproduce the above > - * copyright notice, this list of conditions and the following > - * disclaimer in the documentation and/or other materials > - * provided with the distribution. > - * > - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, > - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF > - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND > - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS > - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN > - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN > - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE > - * SOFTWARE. > - */ > - > -#ifndef MLX4_IB_USER_H > -#define MLX4_IB_USER_H > - > -#include <linux/types.h> > - > -/* > - * Increment this value if any changes that break userspace ABI > - * compatibility are made. > - */ > - > -#define MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION 3 > -#define MLX4_IB_UVERBS_ABI_VERSION 4 > - > -/* > - * Make sure that all structs defined in this file remain laid out so > - * that they pack the same way on 32-bit and 64-bit architectures (to > - * avoid incompatibility between 32-bit userspace and 64-bit kernels). > - * In particular do not use pointer types -- pass pointers in __u64 > - * instead. > - */ > - > -struct mlx4_ib_alloc_ucontext_resp_v3 { > - __u32 qp_tab_size; > - __u16 bf_reg_size; > - __u16 bf_regs_per_page; > -}; > - > -struct mlx4_ib_alloc_ucontext_resp { > - __u32 dev_caps; > - __u32 qp_tab_size; > - __u16 bf_reg_size; > - __u16 bf_regs_per_page; > - __u32 cqe_size; > -}; > - > -struct mlx4_ib_alloc_pd_resp { > - __u32 pdn; > - __u32 reserved; > -}; > - > -struct mlx4_ib_create_cq { > - __u64 buf_addr; > - __u64 db_addr; > -}; > - > -struct mlx4_ib_create_cq_resp { > - __u32 cqn; > - __u32 reserved; > -}; > - > -struct mlx4_ib_resize_cq { > - __u64 buf_addr; > -}; > - > -struct mlx4_ib_create_srq { > - __u64 buf_addr; > - __u64 db_addr; > -}; > - > -struct mlx4_ib_create_srq_resp { > - __u32 srqn; > - __u32 reserved; > -}; > - > -struct mlx4_ib_create_qp { > - __u64 buf_addr; > - __u64 db_addr; > - __u8 log_sq_bb_count; > - __u8 log_sq_stride; > - __u8 sq_no_prefetch; > - __u8 reserved[5]; > -}; > - > -#endif /* MLX4_IB_USER_H */ > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.c > deleted file mode 100644 > index daf68dcccb1b..000000000000 > --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.c > +++ /dev/null > @@ -1,3074 +0,0 @@ > -/* > - * Copyright (c) 2014, Microsoft Corporation. > - * > - * Author: > - * K. Y. Srinivasan <kys@microsoft.com> > - * > - * This program is free software; you can redistribute it and/or modify it > - * under the terms of the GNU General Public License version 2 as published > - * by the Free Software Foundation. 
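The user.h being moved here mirrors the upstream mlx4 uapi structs byte for byte, which is the other half of the masquerade: hvnd_ib_create_cq() earlier in this patch copies struct mlx4_ib_create_cq straight out of udata, so the "pack the same on 32-bit and 64-bit" comment is load-bearing. If the copies ever drift from upstream, a build-time check somewhere in the guest driver's init path would catch it; my addition, not in the patch:

	/* two __u64 fields, no padding possible */
	BUILD_BUG_ON(sizeof(struct mlx4_ib_create_cq) != 16);
	/* response layout consumed directly by libmlx4 */
	BUILD_BUG_ON(sizeof(struct mlx4_ib_alloc_ucontext_resp) != 16);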
> - * > - * This program is distributed in the hope that it will be useful, but > - * WITHOUT ANY WARRANTY; without even the implied warranty of > - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or > - * NON INFRINGEMENT. See the GNU General Public License for more > - * details. > - * > - * Bug fixes/enhancements: Long Li <longli@microsoft.com> > - */ > - > -#include <linux/completion.h> > -#include <linux/module.h> > -#include <linux/errno.h> > -#include <linux/hyperv.h> > -#include <linux/efi.h> > -#include <linux/slab.h> > -#include <linux/cred.h> > -#include <linux/sched.h> > -#include <linux/types.h> > -#include <linux/scatterlist.h> > -#include <rdma/ib_umem.h> > -#include <rdma/ib_user_verbs.h> > -#include <asm-generic/delay.h> > - > -#include "vmbus_rdma.h" > - > -/* > - * We only have a single rdma device on the host; > - * have a single receive buffer. > - */ > - > - > -static char hvnd_recv_buffer[PAGE_SIZE * 4]; > - > -static atomic_t irp_local_hdl; > - > -char *hvnd_get_op_name(int ioctl) > -{ > - switch (ioctl) { > - case IOCTL_ND_PROVIDER_INIT: > - return "IOCTL_ND_PROVIDER_INIT\n"; > - case IOCTL_ND_PROVIDER_BIND_FILE: > - return "IOCTL_ND_PROVIDER_BIND_FILE\n"; > - case IOCTL_ND_ADAPTER_OPEN: > - return "IOCTL_ND_ADAPTER_OPEN\n"; > - > - case IOCTL_ND_ADAPTER_CLOSE: > - return "IOCTL_ND_ADAPTER_CLOSE\n"; > - > - case IOCTL_ND_ADAPTER_QUERY: > - return "IOCTL_ND_ADAPTER_QUERY\n"; > - > - case IOCTL_ND_PD_CREATE: > - return "IOCTL_ND_PD_CREATE\n"; > - > - case IOCTL_ND_PD_FREE: > - return "IOCTL_ND_PD_FREE\n"; > - > - case IOCTL_ND_CQ_CREATE: > - return "IOCTL_ND_CQ_CREATE\n"; > - > - case IOCTL_ND_CQ_FREE: > - return "IOCTL_ND_CQ_FREE\n"; > - case IOCTL_ND_CQ_CANCEL_IO: > - return "IOCTL_ND_CQ_CANCEL_IO\n"; > - case IOCTL_ND_CQ_GET_AFFINITY: > - return "IOCTL_ND_CQ_GET_AAFINITY\n"; > - case IOCTL_ND_CQ_MODIFY: > - return "IOCTL_ND_CQ_MODIFY\n"; > - > - case IOCTL_ND_CQ_NOTIFY: > - return "IOCTL_ND_CQ_NOTIFY\n"; > - > - > - case IOCTL_ND_LISTENER_CREATE: > - return "IOCTL_ND_LISTENER_CREATE\n"; > - > - case IOCTL_ND_LISTENER_FREE: > - return "IOCTL_ND_LISTENER_FREE\n"; > - > - case IOCTL_ND_QP_FREE: > - return "IOCTL_ND_QP_FREE\n"; > - > - case IOCTL_ND_CONNECTOR_CANCEL_IO: > - return "IOCTL_ND_CONNECTOR_CANCEL_IO\n"; > - > - case IOCTL_ND_LISTENER_CANCEL_IO: > - return "IOCTL_ND_LISTENER_CANCEL_IO\n"; > - > - case IOCTL_ND_LISTENER_BIND: > - return "IOCTL_ND_LISTENER_BIND\n"; > - > - case IOCTL_ND_LISTENER_LISTEN: > - return "IOCTL_ND_LISTENER_LISTEN\n"; > - > - case IOCTL_ND_LISTENER_GET_ADDRESS: > - return "IOCTL_ND_LISTENER_GET_ADDRESS\n"; > - > - case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: > - return "IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST\n"; > - > - > - > - case IOCTL_ND_CONNECTOR_CREATE: > - return "IOCTL_ND_CONNECTOR_CREATE\n"; > - > - case IOCTL_ND_CONNECTOR_FREE: > - return "IOCTL_ND_CONNECTOR_FREE\n"; > - > - case IOCTL_ND_CONNECTOR_BIND: > - return "IOCTL_ND_CONNECTOR_BIND\n"; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_CONNECT: > - return "IOCTL_ND_CONNECTOR_CONNECT\n"; > - > - case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: > - return "IOCTL_ND_CONNECTOR_COMPLETE_CONNECT\n"; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_ACCEPT: > - return "IOCTL_ND_CONNECTOR_ACCEPT\n"; > - > - case IOCTL_ND_CONNECTOR_REJECT: > - return "IOCTL_ND_CONNECTOR_REJECT\n"; > - > - case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: > - return "IOCTL_ND_CONNECTOR_GET_READ_LIMITS\n"; > - > - case 
IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: > - return "IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA\n"; > - > - case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: > - return "IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS\n"; > - > - case IOCTL_ND_CONNECTOR_GET_ADDRESS: > - return "IOCTL_ND_CONNECTOR_GET_ADDRESS\n"; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: > - return "IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT\n"; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_DISCONNECT: > - return "IOCTL_ND_CONNECTOR_DISCONNECT\n"; > - > - > - > - case IOCTL_ND_QP_CREATE: > - return "IOCTL_ND_QP_CREATE\n"; > - > - case IOCTL_ND_MR_CREATE: > - return "IOCTL_ND_MR_CREATE\n"; > - > - case IOCTL_ND_MR_FREE: > - return "IOCTL_ND_MR_FREE\n"; > - case IOCTL_ND_MR_REGISTER: > - return "IOCTL_ND_MR_REGISTER\n"; > - case IOCTL_ND_MR_DEREGISTER: > - return "IOCTL_ND_MR_DEREGISTER\n"; > - case IOCTL_ND_MR_CANCEL_IO: > - return "IOCTL_ND_MR_CANCEL_IO\n"; > - case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: > - return "IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST\n"; > - case IOCTL_ND_QP_FLUSH: > - return "IOCTL_ND_QP_FLUSH\n"; > - > - default: > - return "Unknown IOCTL\n"; > - } > -} > -int get_irp_handle(struct hvnd_dev *nd_dev, u32 *local, void *irp_ctx) > -{ > - unsigned int local_handle; > - int ret; > - > - local_handle = atomic_inc_return(&irp_local_hdl); > - *local = local_handle; > - > - /* > - * Now asssociate the local handle with the pointer. > - */ > - ret = insert_handle(nd_dev, &nd_dev->irpidr, irp_ctx, local_handle); > - hvnd_debug("irp_ctx=%p local_handle=%u\n", irp_ctx, local_handle); > - > - if (ret) { > - hvnd_error("insert_handle failed ret=%d\n", ret); > - return ret; > - } > - > - return 0; > -} > - > -void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp) > -{ > - remove_handle(nd_dev, &nd_dev->irpidr, irp); > - > -} > - > -static void init_pfn(u64 *pfn, void *addr, u32 length) > -{ > - int i; > - u32 offset = offset_in_page(addr); > - u32 num_pfn = DIV_ROUND_UP(offset + length, PAGE_SIZE); > - > - for (i = 0; i < num_pfn; i++) > - pfn[i] = virt_to_phys((u8 *)addr + > - (PAGE_SIZE * i)) >> PAGE_SHIFT; > -} > - > - > -static void user_va_init_pfn(u64 *pfn, struct ib_umem *umem) > -{ > - int entry; > - struct scatterlist *sg; > - int i = 0; > - > - for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) { > - pfn[i++] = page_to_pfn(sg_page(sg)); > - } > -} > - > -static u32 get_local_handle(void *p) > -{ > - u64 val = (unsigned long)p; > - > - return (u32)val; > -} > - > -static int hvnd_send_pg_buffer(struct hvnd_dev *nd_dev, > - struct vmbus_packet_mpb_array *desc, > - u32 desc_size, > - void *buffer, > - u32 bufferlen, u64 cookie) > -{ > - int ret; > - int t; > - struct hvnd_cookie hvnd_cookie; > - > - hvnd_cookie.pkt = (void *)cookie; > - init_completion(&hvnd_cookie.host_event); > - > - ret = vmbus_sendpacket_mpb_desc(nd_dev->hvdev->channel, > - desc, > - desc_size, > - buffer, bufferlen, > - (u64)(&hvnd_cookie)); > - > - if (ret) { > - hvnd_error("vmbus_sendpacket_mpb_desc failed ret=%d\n", ret); > - goto err; > - } > - > - t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); > - > - if (t == 0) { > - hvnd_error("wait_for_completion_timeout timed out\n"); > - ret = -ETIMEDOUT; > - } > - > -err: > - return ret; > -} > - > -static int hvnd_send_packet(struct hvnd_dev *nd_dev, void *buffer, > - u32 bufferlen, u64 cookie, bool block) > -{ > - int ret; > - int t; > - struct hvnd_cookie hvnd_cookie; > - > - hvnd_cookie.pkt = (void *)cookie; > - 
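Two notes on the helpers above. Every string in hvnd_get_op_name() embeds a trailing "\n", and callers interpolate the result mid-message ("IOCTL: %s failed; status is %x\n"), so each such log line gets split in two; dropping the newlines from the table fixes all call sites at once:

	case IOCTL_ND_PROVIDER_INIT:
		return "IOCTL_ND_PROVIDER_INIT";

(IOCTL_ND_CQ_GET_AFFINITY is also misspelled "AAFINITY" in its string, and "asssociate" in get_irp_handle()'s comment has one s too many.) Separately, get_local_handle() truncates a kernel pointer to u32; presumably fine as an opaque cookie for the host, but unlike the idr-allocated IRP handles it is not guaranteed unique.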
init_completion(&hvnd_cookie.host_event); > - > - ret = vmbus_sendpacket(nd_dev->hvdev->channel, buffer, bufferlen, > - (u64)(&hvnd_cookie), VM_PKT_DATA_INBAND, > - VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); > - > - if (ret) { > - hvnd_error("vmbus_send pkt failed: %d\n", ret); > - goto err; > - } > - > - if (!block) > - return ret; > - > - t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); > - > - if (t == 0) { > - hvnd_error("wait_for_completion_timeout timed out\n"); > - ret = -ETIMEDOUT; > - } > - > -err: > - return ret; > -} > - > -static int hvnd_send_pgbuf_ioctl_pkt(struct hvnd_dev *nd_dev, > - struct vmbus_packet_mpb_array *desc, > - u32 desc_size, > - struct ndv_packet_hdr_control_1 *hdr, > - u32 pkt_size, u64 cookie) > -{ > - int ret; > - int ioctl; > - > - ioctl = hdr->io_cntrl_code; > - > - > - ret = hvnd_send_pg_buffer(nd_dev, desc, desc_size, > - hdr, pkt_size, cookie); > - > - if (ret) > - return ret; > - > - if (hdr->pkt_hdr.status != 0) { > - hvnd_error("IOCTL: %s failed; status is %x\n", > - hvnd_get_op_name(ioctl), > - hdr->pkt_hdr.status); > - return -EINVAL; > - } > - > - switch (hdr->io_status) { > - case STATUS_SUCCESS: > - case STATUS_PENDING: > - return 0; > - > - default: > - hvnd_error("IOCTL: %s failed io status is %x\n", > - hvnd_get_op_name(ioctl), hdr->io_status); > - return -EINVAL; > - } > -} > - > -int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev, > - struct ndv_packet_hdr_control_1 *hdr, > - u32 pkt_size, u64 cookie) > -{ > - int ret; > - int ioctl; > - bool block; > - > - block = (hdr->irp_handle.val64 == 0) ? true : false; > - > - > - ioctl = hdr->io_cntrl_code; > - > - ret = hvnd_send_packet(nd_dev, hdr, pkt_size, cookie, block); > - > - if (ret) > - return ret; > - > - if (!block) > - return ret; > - > - if (hdr->pkt_hdr.status != 0) { > - hvnd_error("IOCTL: %s failed; status is %x\n", > - hvnd_get_op_name(ioctl), hdr->pkt_hdr.status); > - return -EINVAL; > - } > - > - switch (hdr->io_status) { > - case STATUS_SUCCESS: > - case STATUS_PENDING: > - return 0; > - > - default: > - hvnd_warn("IOCTL: %s failed io status is %x\n", > - hvnd_get_op_name(ioctl), hdr->io_status); > - return -EINVAL; > - } > -} > - > -void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr, > - u32 data_sz, u32 local, u32 remote, > - u32 ioctl_code, > - u32 ext_data_sz, u32 ext_data_offset, > - u64 irp_handle) > - > -{ > - int pkt_type; > - > - pkt_type = NDV_PKT_ID1_CONTROL; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - hdr->pkt_hdr.packet_type = pkt_type; > - hdr->pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > - hdr->pkt_hdr.data_sz = data_sz; > - > - hdr->pkt_hdr.status = 0; > - > - hdr->file_handle.local = local; > - hdr->file_handle.remote = remote; > - hdr->irp_handle.val64 = irp_handle; > - > - hdr->io_cntrl_code = ioctl_code; > - hdr->output_buf_sz = data_sz - ext_data_sz; > - hdr->input_buf_sz = data_sz - ext_data_sz; > - > - hdr->input_output_buf_offset = 0; > - > - hdr->extended_data.size = ext_data_sz; > - hdr->extended_data.offset = ext_data_offset; > -} > - > - > -int hvnd_create_file(struct hvnd_dev *nd_dev, void *uctx, > - struct ndv_pkt_hdr_create_1 *create, u32 file_flags) > -{ > - int ret; > - int pkt_type; > - > - > - pkt_type = NDV_PKT_ID1_CREATE; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - create->pkt_hdr.packet_type = pkt_type; > - create->pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); > - create->pkt_hdr.data_sz = 0; > - > - create->handle.local = 
get_local_handle(uctx); > - create->access_mask = STANDARD_RIGHTS_ALL; > - create->open_options = OPEN_EXISTING; > - create->file_attributes = FILE_ATTRIBUTE_NORMAL | file_flags; > - create->share_access = FILE_SHARE_ALL; > - > - ret = hvnd_send_packet(nd_dev, create, > - sizeof(struct ndv_pkt_hdr_create_1), > - (unsigned long)create, true); > - return ret; > -} > - > -int hvnd_cleanup_file(struct hvnd_dev *nd_dev, u32 local, u32 remote) > -{ > - int ret; > - int pkt_type; > - struct ndv_pkt_hdr_cleanup_1 cleanup_pkt; > - > - > - pkt_type = NDV_PKT_ID1_CLEANUP; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - > - cleanup_pkt.pkt_hdr.packet_type = pkt_type; > - cleanup_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); > - cleanup_pkt.pkt_hdr.data_sz = 0; > - > - cleanup_pkt.handle.local = local; > - cleanup_pkt.handle.remote = remote; > - > - ret = hvnd_send_packet(nd_dev, &cleanup_pkt, > - sizeof(struct ndv_pkt_hdr_create_1), > - (unsigned long)&cleanup_pkt, true); > - return ret; > -} > - > - > -static int hvnd_do_ioctl(struct hvnd_dev *nd_dev, u32 ioctl, > - struct pkt_nd_provider_ioctl *pkt, > - union ndv_context_handle *hdr_handle, > - struct nd_handle *ioctl_handle, > - u8 *buf, u32 buf_len, bool c_in, bool c_out, u64 irp_val) > -{ > - int ret; > - int pkt_type; > - > - pkt_type = NDV_PKT_ID1_CONTROL; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - > - pkt->hdr.pkt_hdr.packet_type = pkt_type; > - pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > - pkt->hdr.pkt_hdr.data_sz = (sizeof(struct pkt_nd_provider_ioctl) - > - sizeof(struct ndv_packet_hdr_control_1)); > - > - pkt->hdr.file_handle.local = hdr_handle->local; > - pkt->hdr.file_handle.remote = hdr_handle->remote; > - hvnd_debug("create handle local: %x remote: %x\n", > - hdr_handle->local, hdr_handle->remote); > - > - pkt->hdr.irp_handle.val64 = irp_val; > - > - pkt->hdr.io_cntrl_code = ioctl; > - pkt->hdr.output_buf_sz = sizeof(struct nd_ioctl); > - pkt->hdr.input_buf_sz = sizeof(struct nd_ioctl); > - pkt->hdr.input_output_buf_offset = 0; > - memset(&pkt->ioctl.handle, 0, sizeof(struct nd_handle)); > - pkt->ioctl.handle.version = ND_VERSION_1; > - > - switch (ioctl) { > - case IOCTL_ND_PROVIDER_BIND_FILE: > - pkt->ioctl.handle.handle = ioctl_handle->handle; > - break; > - default: > - break; > - }; > - > - /* > - * Copy the input buffer, if needed. 
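Both hvnd_create_file() and hvnd_cleanup_file() above funnel through hvnd_send_packet() (quoted earlier), which tags the request with a per-call cookie and parks on a completion until the host replies or a 500*HZ timeout fires. A minimal userspace analogue of that cookie-plus-completion handshake, with all names invented, just to make the control flow explicit:

#include <pthread.h>
#include <stdio.h>

struct cookie {				/* plays the role of hvnd_cookie */
	pthread_mutex_t lock;
	pthread_cond_t  done;
	int             completed;
};

static void *host_reply(void *arg)	/* what the channel callback does */
{
	struct cookie *c = arg;

	pthread_mutex_lock(&c->lock);
	c->completed = 1;		/* mark the request finished */
	pthread_cond_signal(&c->done);	/* complete(&...->host_event) */
	pthread_mutex_unlock(&c->lock);
	return NULL;
}

int main(void)
{
	struct cookie c = {
		.lock = PTHREAD_MUTEX_INITIALIZER,
		.done = PTHREAD_COND_INITIALIZER,
	};
	pthread_t t;

	pthread_create(&t, NULL, host_reply, &c);	/* "send" the packet */

	pthread_mutex_lock(&c.lock);
	while (!c.completed)				/* wait_for_completion() */
		pthread_cond_wait(&c.done, &c.lock);
	pthread_mutex_unlock(&c.lock);

	pthread_join(t, NULL);
	puts("request completed");
	return 0;
}

(The real helper also propagates -ETIMEDOUT when the wait expires; that is dropped here for brevity.)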
> - */ > - > - if (c_in && (buf != NULL)) > - memcpy(pkt->ioctl.raw_buffer, buf, buf_len); > - > - ret = hvnd_send_packet(nd_dev, pkt, > - sizeof(struct pkt_nd_provider_ioctl), > - (unsigned long)pkt, true); > - > - if (ret) > - return ret; > - > - if (c_out && (buf != NULL)) > - memcpy(buf, pkt->ioctl.raw_buffer, buf_len); > - > - return ret; > -} > - > -static int idr_callback(int id, void *p, void *data) > -{ > - if (p == data) > - return id; > - return 0; > -} > - > -void remove_uctx(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > -{ > - int pid = current_pid(); > - unsigned long flags; > - int id; > - > - if (get_uctx(nd_dev, pid) == uctx) > - remove_handle(nd_dev, &nd_dev->uctxidr, pid); > - else { > - hvnd_warn("uctx %p not found on pid %d, doing a idr search\n", > - uctx, current_pid()); > - > - spin_lock_irqsave(&nd_dev->id_lock, flags); > - id = idr_for_each(&nd_dev->uctxidr, idr_callback, uctx); > - spin_unlock_irqrestore(&nd_dev->id_lock, flags); > - > - if (id) > - remove_handle(nd_dev, &nd_dev->uctxidr, id); > - else { > - hvnd_error("uctx %p not found in idr table\n", uctx); > - return; > - } > - } > - > - kfree(uctx); > -} > - > -int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > -{ > - int ret; > - > - /* > - * First close the adaptor. > - */ > - > - ret = hvnd_free_handle(nd_dev, uctx, > - uctx->adaptor_hdl, > - IOCTL_ND_ADAPTER_CLOSE); > - > - if (ret) > - hvnd_error("Adaptor close failed; ret is %x\n", ret); > - > - /* > - * Now close the two files we created. > - */ > - > - ret = hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, > - uctx->file_handle_ovl.remote); > - > - if (ret) > - hvnd_error("file cleanup failed; ret is %x\n", ret); > - > - ret = hvnd_cleanup_file(nd_dev, uctx->file_handle.local, > - uctx->file_handle.remote); > - > - if (ret) > - hvnd_error("File cleanup failed; ret is %x\n", ret); > - > - /* > - * Remove the uctx from the ID table. > - */ > - remove_uctx(nd_dev, uctx); > - > - return 0; > -} > - > -int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > -{ > - int ret; > - struct pkt_nd_provider_ioctl *pr_init = &uctx->pr_init_pkt; > - int pkt_type; > - struct nd_handle ioctl_handle; > - struct pkt_nd_open_adapter *pr_o_adap = &uctx->o_adap_pkt; > - > - ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt, 0); > - if (ret) { > - hvnd_error("hvnd_create_file failed ret=%d\n", ret); > - goto error_cr; > - } > - > - if (uctx->create_pkt.pkt_hdr.status != 0) { > - hvnd_error("create File failed; status is %d\n", > - uctx->create_pkt.pkt_hdr.status); > - ret = -EINVAL; > - goto error_cr; > - } > - > - uctx->file_handle.local = uctx->create_pkt.handle.local; > - uctx->file_handle.remote = uctx->create_pkt.handle.remote; > - hvnd_debug("INITIALIZE PROVIDER\n"); > - /* > - * Now Initialize the Provider. > - */ > - ioctl_handle.handle = 0; > - ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_INIT, pr_init, > - &uctx->create_pkt.handle, > - &ioctl_handle, NULL, 0, false, false, 0); > - > - if (ret) { > - ret = -EINVAL; > - goto error_pr_init; > - } > - > - if (pr_init->hdr.pkt_hdr.status != 0) { > - hvnd_error("Provider INIT failed; status is %d\n", > - pr_init->hdr.pkt_hdr.status); > - ret = -EINVAL; > - goto error_pr_init; > - } > - > - if (pr_init->hdr.io_status != 0) { > - hvnd_error("Provider INIT failed; io status is %d\n", > - pr_init->hdr.io_status); > - ret = -EINVAL; > - goto error_pr_init; > - } > - > - /* > - * Now create the overlap file. 
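remove_uctx() above keys user contexts in the IDR by pid and only falls back to a full idr_for_each() scan (via idr_callback()) when the pid-keyed lookup misses. The shape of that fallback, restated as a self-contained sketch with a plain array standing in for the IDR:

#include <stdio.h>

#define NSLOTS 8

static void *table[NSLOTS];		/* stand-in for nd_dev->uctxidr */

static int find_id(const void *p)
{
	/*
	 * Mirrors idr_for_each() + idr_callback(): return the id whose
	 * slot holds p, or 0 when the value is absent.
	 */
	for (int id = 1; id < NSLOTS; id++)
		if (table[id] == p)
			return id;
	return 0;
}

int main(void)
{
	int uctx;

	table[5] = &uctx;
	printf("found id %d\n", find_id(&uctx));	/* prints 5 */
	return 0;
}

Note the same id == 0 convention as the quoted code: a context stored at id 0 would be indistinguishable from "not found".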
> - */ > - > - hvnd_debug("CREATE OVERLAP FILE\n"); > - ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt_ovl, > - FILE_FLAG_OVERLAPPED); > - if (ret) { > - hvnd_error("hvnd_create_file failed ret=%d\n", ret); > - goto error_pr_init; > - } > - > - if (uctx->create_pkt_ovl.pkt_hdr.status != 0) { > - hvnd_error("create Overlap File failed; status is %d\n", > - uctx->create_pkt_ovl.pkt_hdr.status); > - ret = -EINVAL; > - goto error_pr_init; > - } > - uctx->file_handle_ovl.local = uctx->create_pkt_ovl.handle.local; > - uctx->file_handle_ovl.remote = uctx->create_pkt_ovl.handle.remote; > - > - /* > - * Now bind the two file handles together. > - */ > - > - hvnd_debug("BIND FILE IOCTL remote handle: %d local handle: %d\n", > - uctx->create_pkt_ovl.handle.remote, > - uctx->create_pkt_ovl.handle.local); > - > - ioctl_handle.handle = uctx->create_pkt_ovl.handle.val64; > - ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_BIND_FILE, pr_init, > - &uctx->create_pkt.handle, > - &ioctl_handle, NULL, 0, false, false, 0); > - > - if (ret) { > - ret = -EINVAL; > - goto error_file_bind; > - } > - if (pr_init->hdr.pkt_hdr.status != 0) { > - hvnd_error("Provider File bind failed; status is %d\n", > - pr_init->hdr.pkt_hdr.status); > - ret = -EINVAL; > - goto error_file_bind; > - } > - if (pr_init->hdr.io_status != 0) { > - hvnd_error("Provider INIT failed; io status is %d\n", > - pr_init->hdr.io_status); > - ret = -EINVAL; > - goto error_file_bind; > - } > - > - /* > - * Now open the adaptor. > - */ > - > - hvnd_debug("OPENING THE ADAPTOR\n"); > - > - pkt_type = NDV_PKT_ID1_CONTROL; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - pr_o_adap->hdr.pkt_hdr.packet_type = pkt_type; > - pr_o_adap->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > - pr_o_adap->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_open_adapter) - > - sizeof(struct ndv_packet_hdr_control_1); > - > - pr_o_adap->hdr.pkt_hdr.status = 0; > - > - hvnd_debug("hdr sz is %d\n", pr_o_adap->hdr.pkt_hdr.hdr_sz); > - hvnd_debug("data sz is %d\n", pr_o_adap->hdr.pkt_hdr.data_sz); > - > - pr_o_adap->hdr.file_handle.local = uctx->create_pkt.handle.local; > - pr_o_adap->hdr.file_handle.remote = uctx->create_pkt.handle.remote; > - hvnd_debug("create handle local is %x\n", > - uctx->create_pkt.handle.local); > - hvnd_debug("create handle remote is %x\n", > - uctx->create_pkt.handle.remote); > - pr_o_adap->hdr.irp_handle.val64 = 0; > - > - pr_o_adap->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_OPEN; > - pr_o_adap->hdr.output_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - > - sizeof(struct extended_data_oad); > - pr_o_adap->hdr.input_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - > - sizeof(struct extended_data_oad); > - > - hvnd_debug("output buf sz is %d\n", pr_o_adap->hdr.output_buf_sz); > - hvnd_debug("input buf sz is %d\n", pr_o_adap->hdr.input_buf_sz); > - hvnd_debug("packet size is %d\n", > - (int)sizeof(struct pkt_nd_open_adapter)); > - > - pr_o_adap->hdr.input_output_buf_offset = 0; > - > - > - pr_o_adap->hdr.extended_data.size = sizeof(struct extended_data_oad); > - pr_o_adap->hdr.extended_data.offset = > - offsetof(struct pkt_nd_open_adapter, ext_data) - > - sizeof(struct ndv_packet_hdr_control_1); > - > - hvnd_debug("size of the extended data size: %d\n", > - (int)sizeof(struct extended_data_oad)); > - hvnd_debug("offset of extended data: %d\n", > - pr_o_adap->hdr.extended_data.offset); > - > - /* > - * Now fill out the ioctl section. 
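One layout detail worth calling out in the adapter-open setup above: extended_data.offset is measured from the end of the control header, not from the start of the packet, which is why the code computes offsetof() of the member and then subtracts sizeof(struct ndv_packet_hdr_control_1). A toy layout (struct names invented) showing the same arithmetic:

#include <stdio.h>
#include <stddef.h>

struct hdr {			/* plays ndv_packet_hdr_control_1 */
	char bytes[16];
};

struct packet {
	struct hdr hdr;
	char ioctl_area[32];	/* plays the ioctl/mappings unions */
	char ext_data[8];	/* plays the extended data */
};

int main(void)
{
	/* offset of ext_data relative to the end of the header: 32 */
	printf("%zu\n", offsetof(struct packet, ext_data) - sizeof(struct hdr));
	return 0;
}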
> - */ > - > - pr_o_adap->ioctl.input.version = ND_VERSION_1; > - pr_o_adap->ioctl.input.ce_mapping_cnt = > - RTL_NUMBER_OF(pr_o_adap->mappings.ctx_input.mappings); > - > - hvnd_debug("ce_mapping cnt is %d\n", > - pr_o_adap->ioctl.input.ce_mapping_cnt); > - > - pr_o_adap->ioctl.input.cb_mapping_offset = sizeof(union oad_ioctl); > - hvnd_debug("cb_mapping offset is %d\n", > - pr_o_adap->ioctl.input.cb_mapping_offset); > - pr_o_adap->ioctl.input.adapter_id = (u64)nd_dev; > - > - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_type = ND_MAP_IOSPACE; > - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cache_type = ND_NON_CACHED; > - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cb_length = 4096; > - > - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_type = ND_MAP_IOSPACE; > - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cache_type = ND_WRITE_COMBINED; > - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cb_length = 4096; > - > - /* > - * Fill in the extended data. > - */ > - pr_o_adap->ext_data.cnt = IBV_GET_CONTEXT_MAPPING_MAX; > - > - ret = hvnd_send_packet(nd_dev, pr_o_adap, > - sizeof(struct pkt_nd_open_adapter), > - (unsigned long)pr_o_adap, true); > - if (ret) { > - ret = -EINVAL; > - goto error_file_bind; > - } > - > - if (pr_o_adap->hdr.pkt_hdr.status != 0) { > - hvnd_error("Open adaptor failed; status is %d\n", > - pr_o_adap->hdr.pkt_hdr.status); > - ret = -EINVAL; > - goto error_file_bind; > - } > - > - if (pr_o_adap->hdr.io_status != 0) { > - hvnd_error("Open adaptor failed;io status is %d\n", > - pr_o_adap->hdr.io_status); > - ret = -EINVAL; > - goto error_file_bind; > - } > - > - /* > - * Copy the necessary response from the host. > - */ > - > - uctx->adaptor_hdl = pr_o_adap->ioctl.resrc_desc.handle; > - > - > - hvnd_debug("adaptor handle: %p\n", (void *)uctx->adaptor_hdl); > - > - uctx->uar_base = > - pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_UAR].info; > - hvnd_debug("uar base: %p\n", (void *)uctx->uar_base); > - > - uctx->bf_base = > - pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_BF].info; > - hvnd_debug("bf base: %p\n", (void *)uctx->bf_base); > - > - uctx->bf_buf_size = > - pr_o_adap->mappings.ctx_output.bf_buf_size; > - hvnd_debug("bf buf size: %d\n", uctx->bf_buf_size); > - > - uctx->bf_offset = > - pr_o_adap->mappings.ctx_output.bf_offset; > - hvnd_debug("bf offset: %d\n", uctx->bf_offset); > - > - uctx->cqe_size = > - pr_o_adap->mappings.ctx_output.cqe_size; > - hvnd_debug("cqe size: %d\n", uctx->cqe_size); > - > - uctx->max_qp_wr = > - pr_o_adap->mappings.ctx_output.max_qp_wr; > - hvnd_debug("max qp wr: %d\n", uctx->max_qp_wr); > - > - uctx->max_sge = > - pr_o_adap->mappings.ctx_output.max_sge; > - hvnd_debug("max sge: %d\n", uctx->max_sge); > - > - uctx->max_cqe = > - pr_o_adap->mappings.ctx_output.max_cqe; > - hvnd_debug("max cqe: %d\n", uctx->max_cqe); > - > - uctx->num_qps = > - pr_o_adap->mappings.ctx_output.qp_tab_size; > - hvnd_debug("num qps: %d\n", uctx->num_qps); > - > - /* > - * Now query the adaptor and stash away the adaptor info. 
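Also worth noting in the mapping request above: the UAR page is asked for as ND_NON_CACHED while the BlueFlame page is ND_WRITE_COMBINED, each as a single 4096-byte ND_MAP_IOSPACE entry — presumably mirroring how the native mlx4 driver maps its doorbell and BlueFlame regions. The reply is then cached field by field (uar_base, bf_base, cqe_size, max_qp_wr, ...) in the user context for later mmap and limit checks.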
> - */ > - > - ret = hvnd_query_adaptor(nd_dev, uctx); > - if (ret) { > - hvnd_error("Query Adaptor failed; ret is %d\n", ret); > - goto query_err; > - } > - > - return ret; > - > -query_err: > - hvnd_free_handle(nd_dev, uctx, > - uctx->adaptor_hdl, > - IOCTL_ND_ADAPTER_CLOSE); > - > - hvnd_error("Open Adaptor Failed!!\n"); > - > -error_file_bind: > - hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, > - uctx->file_handle_ovl.remote); > - > -error_pr_init: > - hvnd_cleanup_file(nd_dev, uctx->file_handle.local, > - uctx->file_handle.remote); > - > -error_cr: > - if (get_uctx(nd_dev, current_pid()) != NULL) > - remove_handle(nd_dev, &nd_dev->uctxidr, current_pid()); > - > - return ret; > -} > - > -int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_cq *cq) > -{ > - int ret; > - struct pkt_nd_create_cq *pkt; > - int num_pfn, num_db_pfn, num_sn_pfn; > - int cq_pkt_size; > - unsigned int cq_buf_size, offset; > - u32 ext_data_sz; > - u32 ext_data_offset; > - > - /* > - * Now create CQ. > - * First compute the number of PFNs we need to accomodate: > - * One each for door bell and arm_sn and pages in cq buffer. > - */ > - cq_buf_size = (cq->entries * uctx->cqe_size); > - offset = offset_in_page(cq->cq_buf); > - num_pfn = DIV_ROUND_UP(offset + cq_buf_size, PAGE_SIZE); > - > - offset = offset_in_page(cq->db_addr); > - num_db_pfn = DIV_ROUND_UP(offset + 8, PAGE_SIZE); > - > - offset = offset_in_page(&cq->arm_sn); > - num_sn_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); > - > - cq_pkt_size = sizeof(struct pkt_nd_create_cq) + > - (num_pfn * sizeof(u64)); > - > - ext_data_sz = sizeof(struct create_cq_ext_data) + > - (num_pfn * sizeof(u64)); > - ext_data_offset = offsetof(struct pkt_nd_create_cq, ext_data) - > - sizeof(struct ndv_packet_hdr_control_1); > - > - hvnd_debug("CREATE CQ, num user addr pfns is %d\n", num_pfn); > - hvnd_debug("CREATE CQ, num db pfns is %d\n", num_db_pfn); > - > - pkt = kzalloc(cq_pkt_size, GFP_KERNEL); > - > - if (!pkt) > - return -ENOMEM; > - > - hvnd_init_hdr(&pkt->hdr, > - (cq_pkt_size - > - sizeof(struct ndv_packet_hdr_control_1)), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CQ_CREATE, > - ext_data_sz, > - ext_data_offset, > - 0); > - > - /* > - * Now fill out the ioctl section. 
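The error ladder above (query_err, error_file_bind, error_pr_init, error_cr) is the usual kernel goto-unwind: each label releases exactly what was acquired before the failing step, in reverse acquisition order. Reduced to two allocations, purely illustrative:

#include <stdlib.h>

static int setup_two(char **out_a, char **out_b)
{
	char *a, *b;

	a = malloc(16);		/* step 1, cf. hvnd_create_file() */
	if (!a)
		goto err_a;

	b = malloc(16);		/* step 2, cf. the overlap file */
	if (!b)
		goto err_b;

	*out_a = a;		/* success: caller owns both */
	*out_b = b;
	return 0;

err_b:
	free(a);		/* undo step 1 only */
err_a:
	return -1;
}

int main(void)
{
	char *a, *b;

	if (setup_two(&a, &b) == 0) {
		free(b);
		free(a);
	}
	return 0;
}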
> - */ > - > - pkt->ioctl.input.version = ND_VERSION_1; > - pkt->ioctl.input.queue_depth = cq->entries; > - pkt->ioctl.input.ce_mapping_cnt = MLX4_IB_CREATE_CQ_MAPPING_MAX; > - pkt->ioctl.input.cb_mapping_offset = sizeof(union create_cq_ioctl); > - > - hvnd_debug("ce_mapping cnt is %d\n", pkt->ioctl.input.ce_mapping_cnt); > - hvnd_debug("cb_mapping offset is %d\n", > - pkt->ioctl.input.cb_mapping_offset); > - > - pkt->ioctl.input.adapter_handle = uctx->adaptor_hdl; > - pkt->ioctl.input.affinity.mask = 0; > - pkt->ioctl.input.affinity.group = -1; > - > - // 0 for usermode CQ arming > - pkt->mappings.cq_in.flags = 0; > - > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.map_type = ND_MAP_MEMORY; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.access_type = ND_MODIFY_ACCESS; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.address = (u64)cq->cq_buf; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.cb_length = (cq->entries * uctx->cqe_size); > - > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.access_type = ND_WRITE_ACCESS; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.address = (u64)cq->db_addr; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.cb_length = 8; //size of two ints > - > - > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.map_type = ND_MAP_MEMORY; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.access_type = ND_MODIFY_ACCESS; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.address = (u64)&cq->arm_sn; > - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.cb_length = 4; //size of one int > - /* > - * Fill in the extended data. > - */ > - > - pkt->ext_data.cnt = 3; > - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].size = (sizeof(struct gpa_range) + (num_pfn * sizeof(u64))); > - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].offset = offsetof(struct create_cq_ext_data, cqbuf_gpa); > - > - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].size = sizeof(struct cq_db_gpa); > - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].offset = offsetof(struct create_cq_ext_data, db_gpa); > - > - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].size = sizeof(struct cq_db_gpa); > - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].offset = offsetof(struct create_cq_ext_data, sn_gpa); > - > - /* > - * Fill up the gpa range for cq buffer. > - */ > - > - pkt->ext_data.db_gpa.byte_count = 8; > - pkt->ext_data.db_gpa.byte_offset = offset_in_page(cq->db_addr); > - user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], cq->db_umem); > - > - pkt->ext_data.sn_gpa.byte_count = 4; > - pkt->ext_data.sn_gpa.byte_offset = offset_in_page(&cq->arm_sn); > - init_pfn(&pkt->ext_data.sn_gpa.pfn_array[0], > - &cq->arm_sn, > - 4); > - > - pkt->ext_data.cqbuf_gpa.byte_count = (cq->entries * uctx->cqe_size); > - pkt->ext_data.cqbuf_gpa.byte_offset = offset_in_page(cq->cq_buf); > - user_va_init_pfn(&pkt->ext_data.cqbuf_gpa.pfn_array[0], cq->umem); > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, cq_pkt_size, (u64)pkt); > - > - if (ret) > - goto cr_cq_err; > - > - /* > - * Copy the necessary response from the host. 
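Each of the three gpa_range blocks above (db_gpa, sn_gpa, cqbuf_gpa) is the same three-part description of a guest buffer: byte offset within the first page, total byte count, and one PFN per page touched. A standalone restatement — pfn_of() is a stand-in for virt_to_phys() >> PAGE_SHIFT, which userspace cannot perform:

#include <stdio.h>
#include <stdint.h>

#define PAGE_SIZE 4096UL
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

static uint64_t pfn_of(const void *p)	/* stand-in, not a real PFN */
{
	return (uintptr_t)p / PAGE_SIZE;
}

static void describe(const void *buf, unsigned long len)
{
	unsigned long off = (uintptr_t)buf % PAGE_SIZE;
	unsigned long n = DIV_ROUND_UP(off + len, PAGE_SIZE);
	unsigned long i;

	printf("byte_offset=%lu byte_count=%lu pfns:", off, len);
	for (i = 0; i < n; i++)
		printf(" %llu", (unsigned long long)
		       pfn_of((const char *)buf + i * PAGE_SIZE));
	printf("\n");
}

int main(void)
{
	static char buf[8192];

	describe(buf + 100, 4500);	/* >1 page, so at least two PFNs */
	return 0;
}

The off + len before the round-up is the important part: a buffer that starts mid-page can spill into one extra page, which is exactly the computation init_pfn() (quoted earlier) and the CQ sizing above perform.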
> - */ > - cq->cqn = pkt->mappings.cq_resp.cqn; > - cq->cqe = pkt->mappings.cq_resp.cqe; > - cq->cq_handle = pkt->ioctl.resrc_desc.handle; > - > - ret = insert_handle(nd_dev, &nd_dev->cqidr, cq, cq->cqn); > - > - if (ret) > - goto cr_cq_err; > - hvnd_debug("CQ create after success cqn is %d\n", cq->cqn); > - hvnd_debug("CQ create after success cqe is %d\n", cq->cqe); > - hvnd_debug("CQ create after success cq handle is %p\n", > - (void *)cq->cq_handle); > - > -cr_cq_err: > - kfree(pkt); > - return ret; > -} > - > -int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq) > -{ > - struct pkt_nd_free_cq free_cq_pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&free_cq_pkt, 0, sizeof(free_cq_pkt)); > - hvnd_init_hdr(&free_cq_pkt.hdr, > - sizeof(struct pkt_nd_free_cq) - > - sizeof(struct ndv_packet_hdr_control_1), > - cq->uctx->create_pkt.handle.local, > - cq->uctx->create_pkt.handle.remote, > - IOCTL_ND_CQ_FREE, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - free_cq_pkt.ioctl.in.version = ND_VERSION_1; > - free_cq_pkt.ioctl.in.handle = cq->cq_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &free_cq_pkt.hdr, > - sizeof(struct pkt_nd_free_cq), > - (u64)&free_cq_pkt); > - > - if (ret) > - goto free_cq_err; > - > - remove_handle(nd_dev, &nd_dev->cqidr, cq->cqn); > - > - return 0; > - > -free_cq_err: > - return ret; > -} > - > - > -int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq, > - u32 notify_type, u64 irp_handle) > -{ > - struct pkt_nd_notify_cq notify_cq_pkt; > - int ret; > - union ndv_context_handle irp_fhandle; > - > - irp_fhandle.local = cq->ep_object.local_irp; > - > - // KYS try to avoid having to zero everything > - memset(¬ify_cq_pkt, 0, sizeof(notify_cq_pkt)); > - hvnd_init_hdr(¬ify_cq_pkt.hdr, > - sizeof(struct pkt_nd_notify_cq) - > - sizeof(struct ndv_packet_hdr_control_1), > - cq->uctx->create_pkt.handle.local, > - cq->uctx->create_pkt.handle.remote, > - IOCTL_ND_CQ_NOTIFY, 0, 0, irp_fhandle.val64); > - > - /* > - * Now fill in the ioctl section. > - */ > - notify_cq_pkt.ioctl.in.version = ND_VERSION_1; > - notify_cq_pkt.ioctl.in.cq_handle = cq->cq_handle; > - notify_cq_pkt.ioctl.in.type = notify_type; > - > - > - ret = hvnd_send_ioctl_pkt(nd_dev, ¬ify_cq_pkt.hdr, > - sizeof(struct pkt_nd_notify_cq), > - (u64)¬ify_cq_pkt); > - > - return ret; > -} > - > -/* > - * Memory region operations. > - */ > -int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 pd_handle, u64 *mr_handle) > -{ > - struct pkt_nd_create_mr pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_MR_CREATE, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.handle = pd_handle; > - hvnd_debug("PD handle is %p\n", (void *)pd_handle); > - pkt.ioctl.in.reserved = 0; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - /* > - * Copy the handle. 
> - */ > - hvnd_debug("mr handle is %p\n", (void *)pkt.ioctl.out); > - *mr_handle = pkt.ioctl.out; > - > - return 0; > - > -err: > - hvnd_error("create mr failed: %d\n", ret); > - return ret; > - > -} > - > -int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle) > -{ > - return hvnd_free_handle(nd_dev, uctx, handle, IOCTL_ND_MR_FREE); > -} > - > -int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle) > -{ > - struct pkt_nd_deregister_mr pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_MR_DEREGISTER, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.handle = handle; > - pkt.ioctl.in.reserved = 0; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - hvnd_error("de-register mr failed: %d\n", ret); > - return ret; > - > -} > - > -static inline u32 hvnd_convert_access(int acc) > -{ > - return (acc & IB_ACCESS_REMOTE_WRITE ? ND_MR_FLAG_ALLOW_REMOTE_WRITE : 0) | > - (acc & IB_ACCESS_REMOTE_READ ? ND_MR_FLAG_ALLOW_REMOTE_READ : 0) | > - (acc & IB_ACCESS_LOCAL_WRITE ? ND_MR_FLAG_ALLOW_LOCAL_WRITE : 0); > -} > - > - > -int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_mr *mr) > -{ > - struct pkt_nd_register_mr pkt; > - int ret; > - struct hv_mpb_array *pb; > - struct vmbus_packet_mpb_array *tpb; > - int sz_leaf; > - int num_pgs; > - int i = 0; > - int ext_data_sz; > - u32 acc_flags; > - u32 desc_size; > - int pkt_type; > - > - /* > - * The user address is passed in via a two level structure. > - * An Array of struct hv_page_buffer will be used to describe > - * the user memory. The pages containing this array will be descibed > - * in another array of struct hv_page_buffer. We pass this seconed level > - * array to the host. > - */ > - > - hvnd_debug("ib_umem_page_count(mr->umem)=%d\n", > - ib_umem_page_count(mr->umem)); > - > - sz_leaf = ib_umem_page_count(mr->umem) * sizeof(u64) + > - sizeof(struct hv_mpb_array); > - > - pb = (struct hv_mpb_array *) __get_free_pages(GFP_KERNEL|__GFP_ZERO, > - get_order(sz_leaf)); > - > - if (pb == NULL) > - return -ENOMEM; > - /* > - * Allocate an array of hv_page_buffer to describe the first level. > - */ > - num_pgs = DIV_ROUND_UP(sz_leaf, PAGE_SIZE); > - hvnd_debug("num pages in the top array is %d\n", num_pgs); > - > - desc_size = (num_pgs * sizeof(u64) + > - sizeof(struct vmbus_packet_mpb_array)); > - tpb = (struct vmbus_packet_mpb_array *) > - __get_free_pages(GFP_KERNEL|__GFP_ZERO, get_order(desc_size)); > - > - if (tpb == NULL) { > - free_pages((unsigned long)pb, get_order(sz_leaf)); > - return -ENOMEM; > - } > - > - hvnd_debug("sz leaf: %d; pgs in top %d\n", sz_leaf, num_pgs); > - > - /* > - * Now fill the leaf level array. > - */ > - pb->len = mr->length; > - pb->offset = offset_in_page(mr->start); > - user_va_init_pfn(pb->pfn_array, mr->umem); > - > - /* > - * Now fill out the top level array. 
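The two-level description being built above deserves a size sanity check: one u64 leaf entry per user page, then one top-level PFN per page of the leaf array itself (the real code also prepends the small struct hv_mpb_array / vmbus_packet_mpb_array headers, ignored here). Assumed figures, 4 KiB pages:

#include <stdio.h>

#define PAGE_SIZE 4096UL
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

int main(void)
{
	unsigned long user_bytes = 1UL << 30;			/* 1 GiB MR */
	unsigned long user_pages = user_bytes / PAGE_SIZE;	/* 262144 */
	unsigned long leaf_bytes = user_pages * 8;		/* u64 PFNs */
	unsigned long top_pfns = DIV_ROUND_UP(leaf_bytes, PAGE_SIZE);

	printf("leaf array: %lu bytes, described by %lu top-level PFNs\n",
	       leaf_bytes, top_pfns);				/* 2 MiB, 512 */
	return 0;
}

So even a 1 GiB registration keeps the descriptor that actually crosses VMBus down to a few hundred PFN entries.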
> - */ > - for (i = 0; i < num_pgs; i++) { > - tpb->range.pfn_array[i] = virt_to_phys( > - (u8 *)pb + (PAGE_SIZE * i)) >> PAGE_SHIFT; > - hvnd_debug("virtual address = %p\n", > - (u8 *)pb + (PAGE_SIZE * i)); > - hvnd_debug("physical address = %llx\n", > - virt_to_phys((u8 *)pb + (PAGE_SIZE * i))); > - hvnd_debug("tpb->range.pfn_array[%d]=%llx\n", i, > - tpb->range.pfn_array[i]); > - } > - > - tpb->range.offset = 8; > - tpb->range.len = ib_umem_page_count(mr->umem) * sizeof(u64); > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - ext_data_sz = (ib_umem_page_count(mr->umem) * sizeof(u64)); > - acc_flags = ND_MR_FLAG_DO_NOT_SECURE_VM | hvnd_convert_access(mr->acc); > - hvnd_debug("memory register access flags are: %x\n", acc_flags); > - > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_MR_REGISTER, 0, 0, 0); > - > - /* > - * The memory registration call uses a different mechanism to pass > - * pfn information. > - */ > - > - pkt_type = pkt.hdr.pkt_hdr.packet_type; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTION_EXTERNAL_DATA); > - pkt.hdr.pkt_hdr.packet_type = pkt_type; > - > - pkt.hdr.extended_data.size = ext_data_sz; > - pkt.hdr.extended_data.offset = 0; > - /* > - * Now fill out the ioctl. > - */ > - > - pkt.ioctl.in.header.version = ND_VERSION_1; > - pkt.ioctl.in.header.flags = acc_flags; > - pkt.ioctl.in.header.cb_length = mr->length; > - pkt.ioctl.in.header.target_addr = mr->virt; > - pkt.ioctl.in.header.mr_handle = mr->mr_handle; > - pkt.ioctl.in.address = mr->virt; > - > - /* > - * Now send the packet to the host. > - */ > - > - ret = hvnd_send_pgbuf_ioctl_pkt(nd_dev, > - tpb, desc_size, > - &pkt.hdr, > - sizeof(pkt), > - (unsigned long)&pkt); > - > - if (ret) > - goto err; > - > - hvnd_info("MR REGISTRATION SUCCESS\n"); > - /* > - * Copy the mr registration data. > - */ > - hvnd_debug("mr registration lkey %x\n", pkt.ioctl.out.lkey); > - hvnd_debug("mr registration rkey %x\n", pkt.ioctl.out.rkey); > - > - mr->mr_lkey = pkt.ioctl.out.lkey; > - mr->mr_rkey = pkt.ioctl.out.rkey; > - > - mr->ibmr.lkey = mr->mr_lkey; > - mr->ibmr.rkey = be32_to_cpu(mr->mr_rkey); > - hvnd_debug("ibmr registration lkey %x\n", mr->ibmr.lkey); > - hvnd_debug("ibmr registration rkey %x\n", mr->ibmr.rkey); > - > - free_pages((unsigned long)pb, get_order(sz_leaf)); > - free_pages((unsigned long)tpb, get_order(desc_size)); > - > - return 0; > - > -err: > - free_pages((unsigned long)pb, get_order(sz_leaf)); > - free_pages((unsigned long)tpb, get_order(desc_size)); > - > - hvnd_error("mr register failed: %d\n", ret); > - return ret; > -} > - > -/* > - * Listener operations. > - */ > -int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 *listener_handle) > -{ > - struct pkt_nd_cr_listener pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_LISTENER_CREATE, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. 
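hvnd_convert_access(), used just above to build acc_flags, is a plain bit-for-bit translation from the IB access mask to the ND one. Restated standalone — the IB_ACCESS_* values match include/rdma/ib_verbs.h, while the ND_MR_FLAG_* values here are invented for the sketch:

#include <stdio.h>

#define IB_ACCESS_LOCAL_WRITE	(1 << 0)
#define IB_ACCESS_REMOTE_WRITE	(1 << 1)
#define IB_ACCESS_REMOTE_READ	(1 << 2)

#define ND_MR_FLAG_ALLOW_LOCAL_WRITE	(1 << 0)	/* value assumed */
#define ND_MR_FLAG_ALLOW_REMOTE_READ	(1 << 1)	/* value assumed */
#define ND_MR_FLAG_ALLOW_REMOTE_WRITE	(1 << 2)	/* value assumed */

static unsigned int convert_access(int acc)
{
	return (acc & IB_ACCESS_REMOTE_WRITE ? ND_MR_FLAG_ALLOW_REMOTE_WRITE : 0) |
	       (acc & IB_ACCESS_REMOTE_READ  ? ND_MR_FLAG_ALLOW_REMOTE_READ  : 0) |
	       (acc & IB_ACCESS_LOCAL_WRITE  ? ND_MR_FLAG_ALLOW_LOCAL_WRITE  : 0);
}

int main(void)
{
	printf("0x%x\n",
	       convert_access(IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_READ));
	return 0;
}

Any bit outside the three recognized ones simply drops, and the register path then ORs in ND_MR_FLAG_DO_NOT_SECURE_VM unconditionally.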
> - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; > - hvnd_debug("Adaptor handle is %p\n", (void *)uctx->adaptor_hdl); > - pkt.ioctl.in.hdr.reserved = 0; > - pkt.ioctl.in.to_semantics = false; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - /* > - * Copy the listener handle. > - */ > - hvnd_debug("listener handle is %p\n", (void *)pkt.ioctl.out); > - *listener_handle = pkt.ioctl.out; > - > - return 0; > - > -err: > - hvnd_error("create listener failed: ret=%d uctx=%p adaptor handle=%llu\n", > - ret, uctx, uctx->adaptor_hdl); > - return ret; > - > -} > - > -int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle) > -{ > - struct pkt_nd_free_listener pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_LISTENER_FREE, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.handle = listener_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - hvnd_error("free listener failed: %d\n", ret); > - return ret; > -} > - > -int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, union nd_sockaddr_inet *addr) > -{ > - struct pkt_nd_bind_listener pkt; > - kuid_t uid = current_uid(); > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_LISTENER_BIND, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.handle = listener_handle; > - pkt.ioctl.in.hdr.reserved = 0; > - > - pkt.ioctl.in.authentication_id = (u32)uid.val; > - pkt.ioctl.in.is_admin = false; > - > - memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - hvnd_error("bind listener failed: %d\n", ret); > - return ret; > -} > - > -int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, u32 backlog) > -{ > - struct pkt_nd_listen_listener pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_LISTENER_LISTEN, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. 
> - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.listener_handle = listener_handle; > - pkt.ioctl.in.back_log = backlog; > - > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - hvnd_error("listen listener failed: %d\n", ret); > - return ret; > -} > - > -int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, union nd_sockaddr_inet *addr) > -{ > - struct pkt_nd_get_addr_listener pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_LISTENER_GET_ADDRESS, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.handle = listener_handle; > - > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - /* > - * Copy the adddress. > - */ > - > - memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); > - > - return 0; > - > -err: > - hvnd_error("listen listener failed: %d\n", ret); > - return ret; > -} > - > -int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 listener_handle, u64 connector_handle, > - u64 irp_handle) > -{ > - struct pkt_nd_get_connection_listener pkt; > - int ret; > - union ndv_context_handle irp_fhandle; > - > - ret = get_irp_handle(nd_dev, &irp_fhandle.local, (void *)irp_handle); > - > - if (ret) { > - hvnd_error("get_irp_handle() failed: err: %d\n", ret); > - return ret; > - } > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST, 0, 0, > - irp_fhandle.val64); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.listener_handle = listener_handle; > - pkt.ioctl.in.connector_handle = connector_handle; > - > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - hvnd_error("get connection listener failed: %d\n", ret); > - return ret; > -} > - > -/* > - * Connector APIs. > - */ > - > -int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 *connector_handle) > -{ > - struct pkt_nd_cr_connector pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(struct pkt_nd_cr_listener) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_CREATE, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; > - pkt.ioctl.in.to_semantics = false; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - /* > - * Copy the listener handle. 
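A detail that is easy to miss across these listener calls: hvnd_send_ioctl_pkt() (quoted earlier) only blocks when hdr->irp_handle.val64 is zero, and it treats STATUS_PENDING as success. So bind, listen, and get-address above pass an IRP handle of 0 and wait for the host, while IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST hands in a real handle from get_irp_handle() and returns immediately, completing later through the IRP path.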
> - */ > - hvnd_debug("connector handle is %p\n", (void *)pkt.ioctl.out); > - *connector_handle = pkt.ioctl.out; > - > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle) > -{ > - struct pkt_nd_free_connector pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_FREE, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.handle = handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle, union nd_sockaddr_inet *addr) > -{ > - struct pkt_nd_bind_connector pkt; > - int ret; > - kuid_t uid = current_uid(); > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_BIND, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.handle = handle; > - > - memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); > - > - pkt.ioctl.in.authentication_id = (u32)uid.val; > - pkt.ioctl.in.is_admin = false; > - > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit, > - u32 priv_data_length, const u8 *priv_data, > - u64 qp_handle, struct if_physical_addr *phys_addr, > - union nd_sockaddr_inet *dest_addr, > - struct hvnd_ep_obj *ep) > -{ > - struct pkt_nd_connector_connect *pkt = &ep->connector_connect_pkt; > - int ret; > - union ndv_context_handle irp_fhandle; > - > - hvnd_debug("local irp is %d\n", ep->local_irp); > - irp_fhandle.local = ep->local_irp; > - > - if (priv_data_length > MAX_PRIVATE_DATA_LEN) { > - hvnd_error("priv_data_length=%d\n", priv_data_length); > - return -EINVAL; > - } > - > - // KYS try to avoid having to zero everything > - memset(pkt, 0, sizeof(*pkt)); > - hvnd_init_hdr(&pkt->hdr, > - sizeof(*pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_CONNECT, 0, 0, irp_fhandle.val64); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt->ioctl.in.hdr.version = ND_VERSION_1; > - pkt->ioctl.in.hdr.connector_handle = connector_handle; > - pkt->ioctl.in.hdr.read_limits.inbound = in_rd_limit; > - pkt->ioctl.in.hdr.read_limits.outbound = out_rd_limit; > - pkt->ioctl.in.hdr.cb_private_data_length = priv_data_length; > - pkt->ioctl.in.hdr.cb_private_data_offset = > - offsetof(union connector_connect_ioctl, in.priv_data); > - pkt->ioctl.in.hdr.qp_handle = qp_handle; > - > - memcpy(&pkt->ioctl.in.hdr.phys_addr, phys_addr, > - sizeof(struct if_physical_addr)); > - > - /* > - * Luke's code does not copy the ip address. 
> - */ > - memcpy(&pkt->ioctl.in.hdr.destination_address, dest_addr, > - sizeof(union nd_sockaddr_inet)); > - > - pkt->ioctl.in.retry_cnt = 7; > - pkt->ioctl.in.rnr_retry_cnt = 7; > - memcpy(pkt->ioctl.in.priv_data, priv_data, priv_data_length); > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, sizeof(*pkt), (u64)pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - enum ibv_qp_state *qp_state) > -{ > - struct pkt_nd_connector_connect_complete pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_COMPLETE_CONNECT, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.handle = connector_handle; > - pkt.ioctl.in.rnr_nak_to = 0; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - *qp_state = pkt.ioctl.out.state; > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, > - u64 qp_handle, > - u32 in_rd_limit, u32 out_rd_limit, > - u32 priv_data_length, const u8 *priv_data, > - enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep) > -{ > - struct pkt_nd_connector_accept pkt; > - int ret; > - union ndv_context_handle irp_fhandle; > - > - irp_fhandle.local = ep->local_irp; > - > - if (priv_data_length > MAX_PRIVATE_DATA_LEN) { > - hvnd_error("priv_data_length=%d\n", priv_data_length); > - return -EINVAL; > - } > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_ACCEPT, 0, 0, irp_fhandle.val64); > - > - /* > - * Now fill in the ioctl section. 
> - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.reserved = 0; > - pkt.ioctl.in.hdr.read_limits.inbound = in_rd_limit; > - pkt.ioctl.in.hdr.read_limits.outbound = out_rd_limit; > - pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; > - > - pkt.ioctl.in.hdr.cb_private_data_offset = > - offsetof(struct connector_accept_in, private_data); > - > - pkt.ioctl.in.hdr.connector_handle = connector_handle; > - pkt.ioctl.in.hdr.qp_handle = qp_handle; > - > - pkt.ioctl.in.rnr_nak_to = 0; > - pkt.ioctl.in.rnr_retry_cnt = 7; > - > - > - memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - *qp_state = pkt.ioctl.out.state; > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, > - u32 priv_data_length, u8 *priv_data, > - enum ibv_qp_state *qp_state) > -{ > - struct pkt_nd_connector_reject pkt; > - int ret; > - > - if (priv_data_length > MAX_PRIVATE_DATA_LEN) { > - hvnd_error("priv_data_length=%d\n", priv_data_length); > - return -EINVAL; > - } > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_REJECT, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.hdr.version = ND_VERSION_1; > - pkt.ioctl.in.hdr.reserved = 0; > - pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; > - > - pkt.ioctl.in.hdr.cb_private_data_offset = > - offsetof(struct connector_reject_in, private_data); > - > - pkt.ioctl.in.hdr.connector_handle = connector_handle; > - > - memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - *qp_state = pkt.ioctl.out.state; > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - struct nd_read_limits *rd_limits) > -{ > - struct pkt_nd_connector_get_rd_limits pkt; > - int ret; > - > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_GET_READ_LIMITS, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. 
> - */ > - pkt.ioctl.in.in.version = ND_VERSION_1; > - pkt.ioctl.in.in.reserved = 0; > - pkt.ioctl.in.in.handle = connector_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - *rd_limits = pkt.ioctl.out.out; > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - u8 *priv_data) > -{ > - struct pkt_nd_connector_get_priv_data pkt; > - int ret; > - > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = connector_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - memcpy(priv_data, pkt.ioctl.out, MAX_PRIVATE_DATA_LEN); > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - union nd_sockaddr_inet *peer_addr) > -{ > - struct pkt_nd_connector_get_peer_addr pkt; > - int ret; > - > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = connector_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - memcpy(peer_addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); > - return 0; > - > -err: > - return ret; > -} > - > -int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - union nd_sockaddr_inet *addr) > -{ > - struct pkt_nd_connector_get_addr pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_GET_ADDRESS, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. 
> - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = connector_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); > - return 0; > - > -err: > - return ret; > -} > - > - > -int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, struct hvnd_ep_obj *ep) > -{ > - struct pkt_nd_connector_notify_disconnect pkt; > - int ret; > - union ndv_context_handle irp_fhandle; > - > - irp_fhandle .local = ep->local_irp; > - > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT, 0, 0, > - irp_fhandle.val64); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = connector_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - return ret; > -} > - > - > -//ASYNCH call > -int hvnd_connector_disconnect(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, struct hvnd_ep_obj *ep) > -{ > - struct pkt_nd_connector_disconnect pkt; > - int ret; > - union ndv_context_handle irp_fhandle; > - > - irp_fhandle.local = ep->local_irp; > - > - > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_CONNECTOR_DISCONNECT, 0, 0, irp_fhandle.val64); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = connector_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - return ret; > -} > - > -/* > - * QP operations. > - */ > -int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_qp *qp) > -{ > - int ret; > - struct pkt_nd_create_qp *pkt; > - int num_pfn, num_db_pfn; > - int qp_pkt_size; > - unsigned int offset; > - u32 ext_data_offset; > - u32 ext_data_size; > - > - /* > - * Now create QP. > - * First compute the number of PFNs we need to accomodate: > - * One each for door bell and arm_sn and pages in cq buffer. 
> - */ > - offset = offset_in_page(qp->qp_buf); > - num_pfn = DIV_ROUND_UP(offset + qp->buf_size, PAGE_SIZE); > - > - offset = offset_in_page(qp->db_addr); > - num_db_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); > - > - qp_pkt_size = sizeof(struct pkt_nd_create_qp) + > - (num_pfn * sizeof(u64)); > - > - hvnd_debug("CREATE QP, num pfns is %d\n", num_pfn); > - hvnd_debug("CREATE QP, num DB pfns is %d\n", num_db_pfn); > - > - pkt = kzalloc(qp_pkt_size, GFP_KERNEL); > - > - if (!pkt) > - return -ENOMEM; > - > - hvnd_debug("offset of nd_create_qp is %d\n", > - (int)offsetof(struct pkt_nd_create_qp, ioctl.input)); > - > - ext_data_offset = offsetof(struct pkt_nd_create_qp, ext_data) - > - sizeof(struct ndv_packet_hdr_control_1); > - > - ext_data_size = sizeof(struct create_qp_ext_data) + > - (num_pfn * sizeof(u64)); > - > - hvnd_init_hdr(&pkt->hdr, > - qp_pkt_size - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_QP_CREATE, > - ext_data_size, > - ext_data_offset, > - 0); > - > - /* > - * Now fill out the ioctl section. > - */ > - > - pkt->ioctl.input.hdr.version = ND_VERSION_1; > - > - if (qp->max_inline_data > nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold) > - qp->max_inline_data = nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold; > - pkt->ioctl.input.hdr.cb_max_inline_data = qp->max_inline_data; > - > - hvnd_debug("pkt->ioctl.input.hdr.cb_max_inline_data=%d\n", > - pkt->ioctl.input.hdr.cb_max_inline_data); > - > - pkt->ioctl.input.hdr.ce_mapping_cnt = MLX4_IB_CREATE_QP_MAPPINGS_MAX; > - pkt->ioctl.input.hdr.cb_mapping_offset = sizeof(union create_qp_ioctl); > - > - pkt->ioctl.input.hdr.initiator_queue_depth = qp->initiator_q_depth; > - pkt->ioctl.input.hdr.max_initiator_request_sge = > - qp->initiator_request_sge; > - > - hvnd_debug("recv cq handle is %p\n", (void *)qp->receive_cq_handle); > - hvnd_debug("send cq handle is %p\n", (void *)qp->initiator_cq_handle); > - hvnd_debug("pd handle is %p\n", (void *)qp->pd_handle); > - pkt->ioctl.input.hdr.receive_cq_handle = qp->receive_cq_handle; > - pkt->ioctl.input.hdr.initiator_cq_handle = qp->initiator_cq_handle; > - pkt->ioctl.input.hdr.pd_handle = qp->pd_handle; > - > - > - hvnd_debug("ce_mapping cnt is %d\n", > - pkt->ioctl.input.hdr.ce_mapping_cnt); > - hvnd_debug("cb_mapping offset is %d\n", > - pkt->ioctl.input.hdr.cb_mapping_offset); > - > - pkt->ioctl.input.receive_queue_depth = qp->receive_q_depth; > - pkt->ioctl.input.max_receive_request_sge = qp->receive_request_sge; > - > - > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.map_type = ND_MAP_MEMORY; > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.access_type = ND_MODIFY_ACCESS; > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.address = (u64)qp->qp_buf; > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.cb_length = qp->buf_size; > - > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.access_type = ND_WRITE_ACCESS; > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.address = (u64)qp->db_addr; > - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.cb_length = 4; > - > - pkt->mappings.qp_in.log_sq_bb_count = qp->log_sq_bb_count; > - pkt->mappings.qp_in.log_sq_stride = qp->log_sq_stride; > - pkt->mappings.qp_in.sq_no_prefetch = qp->sq_no_prefetch; > - > - > - /* > - * Fill in the 
extended data. > - */ > - > - pkt->ext_data.cnt = 2; > - pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].size = > - sizeof(struct gpa_range) + (num_pfn * sizeof(u64)); > - pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].offset = > - offsetof(struct create_qp_ext_data, qpbuf_gpa); > - > - pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].size = > - sizeof(struct qp_db_gpa); > - pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].offset = > - offsetof(struct create_qp_ext_data, db_gpa); > - > - /* > - * Fill up the gpa range for qp buffer. > - */ > - > - pkt->ext_data.db_gpa.byte_count = 4; // KYS 8 or 16? > - pkt->ext_data.db_gpa.byte_offset = offset_in_page(qp->db_addr); > - user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], qp->db_umem); > - > - pkt->ext_data.qpbuf_gpa.byte_count = qp->buf_size; > - pkt->ext_data.qpbuf_gpa.byte_offset = offset_in_page(qp->qp_buf); > - user_va_init_pfn(&pkt->ext_data.qpbuf_gpa.pfn_array[0], qp->umem); > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, qp_pkt_size, (u64)pkt); > - > - if (ret) > - goto cr_qp_err; > - > - /* > - * Copy the necessary response from the host. > - */ > - qp->qp_handle = pkt->ioctl.resrc_desc.handle; > - > - qp->qpn = pkt->mappings.qp_resp.qpn; > - qp->max_send_wr = pkt->mappings.qp_resp.max_send_wr; > - qp->max_recv_wr = pkt->mappings.qp_resp.max_recv_wr; > - qp->max_send_sge = pkt->mappings.qp_resp.max_send_sge; > - qp->max_recv_sge = pkt->mappings.qp_resp.max_recv_sge; > - > - > - hvnd_debug("qp->max_send_wr=%d max_recv_wr=%d max_send_sge=%d max_recv_sge=%d max_inline_data=%d\n", > - qp->max_send_wr, qp->max_recv_wr, qp->max_send_sge, > - qp->max_recv_sge, qp->max_inline_data); > - > - ret = insert_handle(nd_dev, &nd_dev->qpidr, qp, qp->qpn); > - > - if (ret) > - goto cr_qp_err; > - > - hvnd_debug("QP create after success qpn:%d qp:%p handle:%llu\n", > - qp->qpn, qp, qp->qp_handle); > - > -cr_qp_err: > - kfree(pkt); > - return ret; > -} > - > -int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_qp *qp) > -{ > - int ret; > - > - ret = hvnd_free_handle(nd_dev, uctx, qp->qp_handle, IOCTL_ND_QP_FREE); > - if (ret == 0) > - remove_handle(nd_dev, &nd_dev->qpidr, qp->qpn); > - return ret; > -} > - > -int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_qp *qp) > -{ > - struct pkt_nd_flush_qp pkt; > - int ret; > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - IOCTL_ND_QP_FLUSH, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = qp->qp_handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - return ret; > -} > - > - > -int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, > - char *mac_addr) > -{ > - int ret; > - int pkt_type = NDV_PKT_ID1_BIND; > - > - /* > - * Send the bind information over to the host. > - * For now, we will have a single ip and MAC address that we > - * will deal with. Down the road we will need to expand support > - * for multiple IP and MAC addresses and also deal with changing > - * IP addresses. 
> - */ > - > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - hvnd_debug("bind packet type is %d ID:%d\n", > - pkt_type, NDV_PACKET_TYPE_ID(pkt_type)); > - nd_dev->bind_pkt.pkt_hdr.packet_type = pkt_type; > - > - nd_dev->bind_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_bind_1); > - hvnd_debug("bind packet size is %d\n", > - (int)sizeof(struct ndv_pkt_hdr_bind_1)); > - nd_dev->bind_pkt.pkt_hdr.data_sz = 0; > - nd_dev->bind_pkt.unbind = un_bind; > - nd_dev->bind_pkt.ip_address.address_family = AF_INET; > - nd_dev->bind_pkt.ip_address.ipv4.sin_family = AF_INET; > - nd_dev->bind_pkt.ip_address.ipv4.sin_port = 0; > - nd_dev->bind_pkt.ip_address.ipv4.sin_addr.s_addr = *(unsigned int *)ip_addr; > - > - nd_dev->bind_pkt.phys_addr.length = ETH_ALEN; > - memcpy(nd_dev->bind_pkt.phys_addr.addr, mac_addr, ETH_ALEN); > - > - /* > - * This is the adapter handle; needs to be unique for each > - * MAC, ip address tuple. > - */ > - nd_dev->bind_pkt.guest_id = (u64)nd_dev; > - > - ret = hvnd_send_packet(nd_dev, &nd_dev->bind_pkt, > - sizeof(struct ndv_pkt_hdr_bind_1), > - (u64)NULL, > - true); > - return ret; > -} > - > -int hvnd_init_resources(struct hvnd_dev *nd_dev) > -{ > - unsigned long mmio_sz; > - struct resource *resrc; > - int ret = -ENOMEM; > - > - resrc = &iomem_resource; > - > - mmio_sz = nd_dev->hvdev->channel->offermsg.offer.mmio_megabytes * > - 1024 * 1024; > - nd_dev->mmio_sz = mmio_sz; > - nd_dev->mmio_resource.name = KBUILD_MODNAME; > - nd_dev->mmio_resource.flags = IORESOURCE_MEM | IORESOURCE_BUSY; > - > - ret = allocate_resource(resrc, &nd_dev->mmio_resource, > - mmio_sz, 0, -1, mmio_sz, NULL, NULL); > - > - if (ret) { > - hvnd_error("Unable to allocate mmio resources\n"); > - return ret; > - } > - hvnd_debug("MMIO start is %p\n", (void *)nd_dev->mmio_resource.start); > - > - /* > - * Send the mmio information over to the host. > - */ > - nd_dev->resources.pkt_hdr.packet_type = NDV_PKT_ID1_INIT_RESOURCES; > - nd_dev->resources.pkt_hdr.hdr_sz = sizeof(union ndv_packet_hdr); > - nd_dev->resources.pkt_hdr.data_sz = 0; > - > - nd_dev->resources.io_space_sz_mb = mmio_sz; > - nd_dev->resources.io_space_start = nd_dev->mmio_resource.start; > - > - ret = hvnd_send_packet(nd_dev, &nd_dev->resources, > - sizeof(struct ndv_pkt_hdr_init_resources_1), > - (u64)NULL, > - true); > - return ret; > -} > - > -int hvnd_query_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) > -{ > - struct pkt_nd_query_adaptor *pkt; > - int ret; > - int pkt_type; > - struct ib_device_attr *props = &nd_dev->ibdev.attrs; > - struct adapter_info_v2 *adap_info = &nd_dev->query_pkt.ioctl.ad_info; > - > - hvnd_debug("Performing Adapter query nd_dev=%p\n", nd_dev); > - > - // check if there is a need to do query > - if (nd_dev->query_pkt_set) > - return 0; > - > - // need a lock, multiple process can call this at the same time > - > - down(&nd_dev->query_pkt_sem); > - if (nd_dev->query_pkt_set) { > - up(&nd_dev->query_pkt_sem); > - return 0; > - } > - > - /* > - * Now query the adaptor. 
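The cached-adapter-query logic quoted just above is the usual double-checked pattern, with a semaphore guarding the slow path. Reduced to its shape (field names from the driver, otherwise a sketch):

    if (nd_dev->query_pkt_set)              /* fast path, unlocked */
            return 0;

    down(&nd_dev->query_pkt_sem);
    if (nd_dev->query_pkt_set) {            /* raced: another caller filled it */
            up(&nd_dev->query_pkt_sem);
            return 0;
    }
    /* ... issue IOCTL_ND_ADAPTER_QUERY and cache the reply ... */
    nd_dev->query_pkt_set = true;
    up(&nd_dev->query_pkt_sem);

The unlocked fast-path read relies on query_pkt_set transitioning only once, from false to true, over the device's lifetime.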
> - */ > - > - pkt = &nd_dev->query_pkt; > - > - pkt_type = NDV_PKT_ID1_CONTROL; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - > - pkt->hdr.pkt_hdr.packet_type = pkt_type; > - pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > - pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_query_adaptor) - > - sizeof(struct ndv_packet_hdr_control_1); > - > - > - pkt->hdr.file_handle.local = uctx->file_handle.local; > - pkt->hdr.file_handle.remote = uctx->file_handle.remote; > - > - pkt->hdr.irp_handle.val64 = 0; > - > - pkt->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_QUERY; > - pkt->hdr.output_buf_sz = sizeof(struct nd_adap_query_ioctl); > - pkt->hdr.input_buf_sz = sizeof(struct nd_adap_query_ioctl); > - pkt->hdr.input_output_buf_offset = 0; > - memset(&pkt->ioctl.ad_q, 0, sizeof(struct nd_adap_query_ioctl)); > - > - pkt->ioctl.ad_q.version = ND_VERSION_1; > - pkt->ioctl.ad_q.info_version = ND_VERSION_2; > - pkt->ioctl.ad_q.adapter_handle = uctx->adaptor_hdl; > - > - ret = hvnd_send_packet(nd_dev, pkt, > - sizeof(struct pkt_nd_query_adaptor), > - (unsigned long)pkt, true); > - > - hvnd_debug("pkt->ioctl.ad_info.inline_request_threshold=%d\n", > - pkt->ioctl.ad_info.inline_request_threshold); > - nd_dev->query_pkt_set = true; > - > - // how about host returning PENDING > - up(&nd_dev->query_pkt_sem); > - > - if (ret) > - return ret; > - > - hvnd_debug("Query Adaptor Succeeded\n"); > - > - /* > - * Cache the relevant properties out. > - */ > - props->fw_ver = 0; > - props->device_cap_flags = 0; > - props->vendor_id = 0x15b3; > - props->vendor_part_id = adap_info->device_id; > - props->max_mr_size = ~0ull; > - props->page_size_cap = PAGE_SIZE; > - props->max_qp = 16384; > - props->max_qp_wr = min(adap_info->max_recv_q_depth, > - adap_info->max_initiator_q_depth); > - props->max_sge = min(adap_info->max_initiator_sge, > - adap_info->max_recv_sge); > - props->max_cq = 0x1FFFF; > - props->max_cqe = adap_info->max_completion_q_depth; > - props->max_mr = 16384; > - props->max_pd = 16384; > - props->max_qp_rd_atom = adap_info->max_inbound_read_limit; > - props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; > - props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; > - props->max_srq = 16384; > - props->max_srq_wr = adap_info->max_recv_q_depth; > - props->max_srq_sge = adap_info->max_recv_sge; > - return 0; > -} > - > - > -int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev, > - struct hvnd_ib_pd *hvnd_pd) > -{ > - struct pkt_nd_pd_create *pkt = &uctx->pd_cr_pkt; > - int ret; > - int pkt_type; > - > - hvnd_debug("Create Protection Domain\n"); > - > - pkt_type = NDV_PKT_ID1_CONTROL; > - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); > - > - pkt->hdr.pkt_hdr.packet_type = pkt_type; > - pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); > - pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_pd_create) - > - sizeof(struct ndv_packet_hdr_control_1); > - > - hvnd_debug("pdcreate packet size: %d\n", > - (int)sizeof(struct pkt_nd_pd_create)); > - hvnd_debug("pdcreate hdr size: %d\n", > - (int)sizeof(struct ndv_packet_hdr_control_1)); > - hvnd_debug("pdcreate data size: %d\n", pkt->hdr.pkt_hdr.data_sz); > - > - pkt->hdr.file_handle.local = uctx->create_pkt.handle.local; > - pkt->hdr.file_handle.remote = uctx->create_pkt.handle.remote; > - > - hvnd_debug("create pd uctx is %p\n", uctx); > - hvnd_debug("create pd local file is %d\n", > - uctx->create_pkt.handle.local); > - hvnd_debug("create pd local file 
is %d\n", > - uctx->create_pkt.handle.remote); > - > - pkt->hdr.irp_handle.val64 = 0; > - pkt->hdr.io_cntrl_code = IOCTL_ND_PD_CREATE; > - > - pkt->hdr.output_buf_sz = sizeof(struct nd_create_pd_ioctl); > - pkt->hdr.input_buf_sz = sizeof(struct nd_create_pd_ioctl); > - pkt->hdr.input_output_buf_offset = 0; > - > - hvnd_debug("output/input buf size: %d\n", pkt->hdr.output_buf_sz); > - > - /* > - * Fill the ioctl section. > - */ > - pkt->ioctl.in.version = ND_VERSION_1; > - pkt->ioctl.in.reserved = 0; > - pkt->ioctl.in.handle = uctx->adaptor_hdl; > - > - > - ret = hvnd_send_packet(nd_dev, pkt, > - sizeof(struct pkt_nd_pd_create), > - (unsigned long)pkt, true); > - > - if (ret) > - return ret; > - > - if (pkt->hdr.pkt_hdr.status != 0) { > - hvnd_error("Create PD failed; status is %d\n", > - pkt->hdr.pkt_hdr.status); > - return -EINVAL; > - } > - if (pkt->hdr.io_status != 0) { > - hvnd_error("Create PD failed;io status is %d\n", > - pkt->hdr.io_status); > - return -EINVAL; > - } > - > - hvnd_debug("Create PD Succeeded\n"); > - > - hvnd_debug("pd_handle is %p\n", (void *)pkt->ioctl.resp.pd_handle); > - hvnd_debug("pdn is %d\n", (int)pkt->ioctl.resp.pdn); > - > - hvnd_pd->pdn = pkt->ioctl.resp.pdn; > - hvnd_pd->handle = pkt->ioctl.out_handle; > - > - return 0; > -} > - > -int hvnd_cancel_io(struct hvnd_ep_obj *ep_object) > -{ > - struct pkt_nd_cancel_io pkt; > - int ret; > - u32 ioctl; > - > - switch (ep_object->type) { > - case ND_LISTENER: > - hvnd_debug("LISTENER I/O Cancelled\n"); > - ioctl = IOCTL_ND_LISTENER_CANCEL_IO; > - break; > - case ND_CONNECTOR: > - hvnd_debug("CONNECTOR I/O Cancelled\n"); > - ioctl = IOCTL_ND_CONNECTOR_CANCEL_IO; > - break; > - case ND_MR: > - hvnd_debug("MR I/O Cancelled\n"); > - ioctl = IOCTL_ND_MR_CANCEL_IO; > - break; > - case ND_CQ: > - hvnd_debug("CQ I/O Cancelled\n"); > - ioctl = IOCTL_ND_CQ_CANCEL_IO; > - break; > - default: > - hvnd_error("UNKNOWN object type\n"); > - return -EINVAL; > - } > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - ep_object->uctx->create_pkt.handle.local, > - ep_object->uctx->create_pkt.handle.remote, > - ioctl, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. > - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = ep_object->ep_handle; > - hvnd_debug("cancel io handle is %p\n", (void *)ep_object->ep_handle); > - > - ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, &pkt.hdr, > - sizeof(pkt), > - (u64)&pkt); > - > - if (ret) > - goto err; > - > - /* > - * Now that we have cancelled all I/Os, > - */ > - > - return 0; > - > -err: > - hvnd_error("cancel I/O operation failed\n"); > - return ret; > -} > - > - > -int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle, u32 ioctl) > -{ > - struct pkt_nd_free_handle pkt; > - int ret; > - > - hvnd_debug("Freeing handle ioctl is %s; handle is %p\n", > - hvnd_get_op_name(ioctl), (void *)handle); > - > - hvnd_debug("uctx is %p\n", uctx); > - hvnd_debug("nd_dev is %p\n", nd_dev); > - > - // KYS try to avoid having to zero everything > - memset(&pkt, 0, sizeof(pkt)); > - hvnd_init_hdr(&pkt.hdr, > - sizeof(pkt) - > - sizeof(struct ndv_packet_hdr_control_1), > - uctx->create_pkt.handle.local, > - uctx->create_pkt.handle.remote, > - ioctl, 0, 0, 0); > - > - /* > - * Now fill in the ioctl section. 
> - */ > - pkt.ioctl.in.version = ND_VERSION_1; > - pkt.ioctl.in.reserved = 0; > - pkt.ioctl.in.handle = handle; > - > - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); > - > - if (ret) > - goto err; > - > - return 0; > - > -err: > - hvnd_error("%s: ret=%d\n", __func__, ret); > - return ret; > -} > - > -int hvnd_negotiate_version(struct hvnd_dev *nd_dev) > -{ > - union ndv_packet_init *pkt = &nd_dev->init_pkt; > - int ret; > - > - nd_dev->negotiated_version = NDV_PROTOCOL_VAERSION_INVALID; > - > - pkt->packet_type = NDV_PACKET_TYPE_INIT; > - pkt->protocol_version = NDV_PROTOCOL_VERSION_CURRENT; > - pkt->flags = 0; // KYS are the flags 0? > - > - ret = hvnd_send_packet(nd_dev, pkt, > - sizeof(union ndv_packet_init), (u64)NULL, true); > - > - return ret; > -} > - > -void hvnd_callback(void *context) > -{ > - struct hv_device *dev = context; > - struct hvnd_dev *nd_dev = hv_get_drvdata(dev); > - int copy_sz = 0; > - struct ndv_packet_hdr_control_1 *ctrl_hdr; > - union ndv_packet_init *pkt_init; > - u32 recvlen; > - u32 local_irp; > - u64 requestid; > - u32 *pkt_type; > - u32 pkt_id; > - struct hvnd_ep_obj *ep_object; > - struct incoming_pkt *incoming_pkt; /* Used only for asynch calls */ > - char *incoming_pkt_start; > - struct vmpacket_descriptor *desc; > - int status; > - struct hvnd_cookie *hvnd_cookie; > - unsigned long flags; > - > - vmbus_recvpacket_raw(dev->channel, hvnd_recv_buffer, > - (PAGE_SIZE * 4), &recvlen, &requestid); > - > - if (recvlen <= 0) > - return; > - > - desc = (struct vmpacket_descriptor *)hvnd_recv_buffer; > - incoming_pkt_start = hvnd_recv_buffer + (desc->offset8 << 3); > - recvlen -= desc->offset8 << 3; > - > - pkt_type = (u32 *)incoming_pkt_start; > - pkt_id = *pkt_type; > - if (pkt_id != NDV_PACKET_TYPE_INIT) > - pkt_id = NDV_PACKET_TYPE_ID(pkt_id); > - > - switch (pkt_id) { > - case NDV_PACKET_TYPE_INIT: > - /* > - * Host is responding to our init packet. 
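A word on the receive path at the top of hvnd_callback(): the VMBus descriptor's offset8 field counts 8-byte units from the start of the descriptor, hence the << 3 when locating the payload. The extraction, isolated (names as in the function):

    desc = (struct vmpacket_descriptor *)hvnd_recv_buffer;

    /* offset8 is in 8-byte units; convert to a byte offset */
    incoming_pkt_start = hvnd_recv_buffer + (desc->offset8 << 3);
    recvlen -= desc->offset8 << 3;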
> - */ > - pkt_init = (union ndv_packet_init *)incoming_pkt_start; > - nd_dev->negotiated_version = pkt_init->protocol_version; > - copy_sz = 0; > - break; > - > - case NDV_PKT_ID1_INIT_RESOURCES: > - copy_sz = 0; > - break; > - > - case NDV_PKT_ID1_BIND: > - nd_dev->bind_pkt.pkt_hdr.status = > - ((union ndv_packet_hdr *) incoming_pkt_start)->status; > - copy_sz = 0; > - break; > - > - case NDV_PKT_ID1_COMPLETE: > - ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; > - status = ctrl_hdr->io_status; > - > - local_irp = ctrl_hdr->irp_handle.local; > - ep_object = (struct hvnd_ep_obj *)map_irp_to_ctx(nd_dev, > - local_irp); > - > - if (!ep_object) { > - hvnd_error("irp could not be mapped; irp is %d ioctl is %s", > - local_irp, > - hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); > - goto complete; > - } > - > - if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) > - hvnd_debug("completion packet; iostatus is %x, ioctl is %s", > - ctrl_hdr->io_status, > - hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); > - > - switch (ctrl_hdr->io_cntrl_code) { > - > - case IOCTL_ND_CQ_NOTIFY: > - hvnd_process_cq_event_complete(ep_object, status); > - > - ep_del_work_pending(ep_object); > - goto complete; > - > - case IOCTL_ND_CONNECTOR_ACCEPT: > - > - hvnd_process_connector_accept(ep_object, status); > - > - ep_del_work_pending(ep_object); > - goto complete; > - > - case IOCTL_ND_CONNECTOR_DISCONNECT: > - hvnd_debug("disconnected: ep opj is %p; status: %d\n", > - ep_object, status); > - hvnd_process_disconnect(ep_object, status); > - > - ep_del_work_pending(ep_object); > - goto complete; > - > - default: > - break; > - } > - > - /* > - * This is the completion notification; > - * the IRP cookie is the state through which > - * we will invoke the callback. > - */ > - incoming_pkt = kmalloc(recvlen + sizeof(struct incoming_pkt), > - GFP_ATOMIC); > - if (incoming_pkt == NULL) { > - hvnd_error("Could not alloc memory in callback\n"); > - ep_del_work_pending(ep_object); > - goto complete; > - } > - memcpy(incoming_pkt->pkt, incoming_pkt_start, recvlen); > - > - spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); > - list_add_tail(&incoming_pkt->list_entry, > - &ep_object->incoming_pkt_list); > - spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, > - flags); > - > - schedule_work(&ep_object->wrk.work); > - > - goto complete; > - > - case NDV_PKT_ID1_CREATE: > - copy_sz = sizeof(struct ndv_pkt_hdr_create_1); > - break; > - > - case NDV_PKT_ID1_CLEANUP: > - copy_sz = sizeof(struct ndv_pkt_hdr_cleanup_1); > - break; > - > - case NDV_PKT_ID1_CONTROL: > - ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; > - status = ctrl_hdr->io_status; > - > - if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) > - hvnd_debug("packet; iostatus is %x ioctl is %s", > - ctrl_hdr->io_status, hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); > - > - switch (ctrl_hdr->io_cntrl_code) { > - > - case IOCTL_ND_PROVIDER_INIT: > - copy_sz = sizeof(struct pkt_nd_provider_ioctl); > - break; > - > - case IOCTL_ND_PROVIDER_BIND_FILE: > - copy_sz = sizeof(struct pkt_nd_provider_ioctl); > - break; > - > - case IOCTL_ND_ADAPTER_OPEN: > - copy_sz = sizeof(struct pkt_nd_open_adapter); > - break; > - > - case IOCTL_ND_ADAPTER_CLOSE: > - copy_sz = sizeof(struct pkt_nd_free_handle); > - break; > - > - case IOCTL_ND_ADAPTER_QUERY: > - copy_sz = sizeof(struct pkt_nd_query_adaptor); > - break; > - > - case IOCTL_ND_PD_CREATE: > - copy_sz = sizeof(struct pkt_nd_pd_create); > - break; > - > - case IOCTL_ND_PD_FREE: > - copy_sz = 
sizeof(struct pkt_nd_free_handle); > - break; > - > - case IOCTL_ND_CQ_CREATE: > - copy_sz = sizeof(struct pkt_nd_create_cq); > - break; > - > - case IOCTL_ND_CQ_FREE: > - copy_sz = sizeof(struct pkt_nd_free_cq); > - break; > - > - case IOCTL_ND_CQ_NOTIFY: //FIXME check ep stop state > - local_irp = ctrl_hdr->irp_handle.local; > - ep_object = (struct hvnd_ep_obj *) > - map_irp_to_ctx(nd_dev, local_irp); > - if (!ep_object) { > - hvnd_error("irp could not be mapped\n"); > - goto complete; > - return; > - } > - copy_sz = sizeof(struct pkt_nd_notify_cq); > - hvnd_process_cq_event_pending(ep_object, status); > - goto complete; > - return; > - > - case IOCTL_ND_LISTENER_CREATE: > - copy_sz = sizeof(struct pkt_nd_cr_listener); > - break; > - > - case IOCTL_ND_LISTENER_FREE: > - copy_sz = sizeof(struct pkt_nd_free_listener); > - break; > - > - case IOCTL_ND_QP_FREE: > - copy_sz = sizeof(struct pkt_nd_free_handle); > - break; > - > - case IOCTL_ND_CONNECTOR_CANCEL_IO: > - case IOCTL_ND_MR_CANCEL_IO: > - case IOCTL_ND_CQ_CANCEL_IO: > - case IOCTL_ND_LISTENER_CANCEL_IO: > - copy_sz = sizeof(struct pkt_nd_cancel_io); > - break; > - > - case IOCTL_ND_LISTENER_BIND: > - copy_sz = sizeof(struct pkt_nd_bind_listener); > - break; > - > - case IOCTL_ND_LISTENER_LISTEN: > - copy_sz = sizeof(struct pkt_nd_listen_listener); > - break; > - > - case IOCTL_ND_LISTENER_GET_ADDRESS: > - copy_sz = sizeof(struct pkt_nd_get_addr_listener); > - break; > - > - case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: > - copy_sz = sizeof(struct pkt_nd_get_connection_listener); > - goto complete; // non-block > - > - case IOCTL_ND_CONNECTOR_CREATE: > - copy_sz = sizeof(struct pkt_nd_cr_connector); > - break; > - > - case IOCTL_ND_CONNECTOR_FREE: > - copy_sz = sizeof(struct pkt_nd_free_connector); > - break; > - > - case IOCTL_ND_CONNECTOR_BIND: > - copy_sz = sizeof(struct pkt_nd_free_connector); > - break; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_CONNECT: > - copy_sz = sizeof(struct pkt_nd_connector_connect); > - goto complete; //non-block > - > - case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: > - copy_sz = sizeof(struct pkt_nd_connector_connect_complete); > - break; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_ACCEPT: > - copy_sz = sizeof(struct pkt_nd_connector_accept); > - goto complete; //non-block > - > - case IOCTL_ND_CONNECTOR_REJECT: > - copy_sz = sizeof(struct pkt_nd_connector_reject); > - break; > - > - case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: > - copy_sz = sizeof(struct pkt_nd_connector_get_rd_limits); > - break; > - > - case IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: > - copy_sz = sizeof(struct pkt_nd_connector_get_priv_data); > - break; > - > - case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: > - copy_sz = sizeof(struct pkt_nd_connector_get_peer_addr); > - break; > - > - case IOCTL_ND_CONNECTOR_GET_ADDRESS: > - copy_sz = sizeof(struct pkt_nd_connector_get_addr); > - break; > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: > - copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); > - goto complete; //non-block > - > - // KYS: ALERT: ASYNCH Operation > - case IOCTL_ND_CONNECTOR_DISCONNECT: > - hvnd_debug("IOCTL_ND_CONNECTOR_DISCONNECT\n"); > - copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); > - goto complete; // non-block > - > - case IOCTL_ND_QP_CREATE: > - copy_sz = sizeof(struct pkt_nd_create_qp); > - break; > - > - case IOCTL_ND_MR_CREATE: > - copy_sz = sizeof(struct pkt_nd_create_mr); > - break; > - > - case IOCTL_ND_MR_FREE: > 
- copy_sz = sizeof(struct pkt_nd_free_handle); > - break; > - > - case IOCTL_ND_MR_REGISTER: > - copy_sz = sizeof(struct pkt_nd_register_mr); > - break; > - > - case IOCTL_ND_MR_DEREGISTER: > - copy_sz = sizeof(struct pkt_nd_deregister_mr); > - break; > - > - case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: > - copy_sz = sizeof(struct pkt_query_addr_list); > - break; > - > - case IOCTL_ND_QP_FLUSH: > - copy_sz = sizeof(struct pkt_nd_flush_qp); > - break; > - > - default: > - hvnd_warn("Got unknown ioctl: %d\n", > - ctrl_hdr->io_cntrl_code); > - copy_sz = 0; > - break; > - } > - > - break; > - default: > - hvnd_warn("Got an unknown packet type %d\n", *pkt_type); > - break; > - } > - > - hvnd_cookie = (struct hvnd_cookie *)requestid; > - memcpy(hvnd_cookie->pkt, incoming_pkt_start, copy_sz); > - complete(&hvnd_cookie->host_event); > - > -complete: > - /* send out ioctl completion patcket */ > - if (desc->flags & VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED) { > - int retry = 5; > - > - while (true) { > - int ret; > - > - ret = vmbus_sendpacket(dev->channel, NULL, 0, > - requestid, VM_PKT_COMP, 0); > - if (ret == 0) { > - break; > - } else if (ret == -EAGAIN) { > - if (--retry == 0) { > - hvnd_error("give up retrying send completion packet\n"); > - break; > - } > - hvnd_warn("retrying send completion packet\n"); > - udelay(100); > - } else { > - hvnd_error("unable to send completion packet ret=%d\n", > - ret); > - break; > - } > - } > - } > - > -} > diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.h > deleted file mode 100644 > index 78e534fa8748..000000000000 > --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.h > +++ /dev/null > @@ -1,2202 +0,0 @@ > -/* > - * Copyright (c) 2014, Microsoft Corporation. > - * > - * Author: > - * K. Y. Srinivasan <kys@microsoft.com> > - * > - * This program is free software; you can redistribute it and/or modify it > - * under the terms of the GNU General Public License version 2 as published > - * by the Free Software Foundation. > - * > - * This program is distributed in the hope that it will be useful, but > - * WITHOUT ANY WARRANTY; without even the implied warranty of > - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or > - * NON INFRINGEMENT. See the GNU General Public License for more > - * details. > - * > - * Bug fixes/enhancements: Long Li <longli@microsoft.com> > - */ > - > - > -#ifndef _VMBUS_RDMA_H > -#define _VMBUS_RDMA_H > - > - > -#include <linux/in.h> > -#include <linux/in6.h> > -#include <rdma/ib_verbs.h> > -#include <linux/idr.h> > -#include <linux/if_ether.h> > - > -/* NetworkDirect version Numbers. 
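One more note on the completion path above: when the host ring is momentarily full, vmbus_sendpacket() returns -EAGAIN and the code retries up to five times, 100us apart, before giving up. The loop's shape, extracted as a sketch:

    int retry = 5;
    int ret;

    do {
            ret = vmbus_sendpacket(dev->channel, NULL, 0, requestid,
                                   VM_PKT_COMP, 0);
            if (ret != -EAGAIN)
                    break;  /* 0 means sent; other errors are final */
            udelay(100);
    } while (--retry);

Busy-waiting with udelay() is presumably acceptable here only because the retry budget is small and the host ring normally drains quickly.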
> - */ > -#define ND_VERSION_1 0x1 > -#define ND_VERSION_2 0x20000 > - > -#ifndef NDVER > -#define NDVER ND_VERSION_2 > -#endif > - > -#define ND_ADAPTER_FLAG_IN_ORDER_DMA_SUPPORTED 0x00000001 > -#define ND_ADAPTER_FLAG_CQ_INTERRUPT_MODERATION_SUPPORTED 0x00000004 > -#define ND_ADAPTER_FLAG_MULTI_ENGINE_SUPPORTED 0x00000008 > -#define ND_ADAPTER_FLAG_CQ_RESIZE_SUPPORTED 0x00000100 > -#define ND_ADAPTER_FLAG_LOOPBACK_CONNECTIONS_SUPPORTED 0x00010000 > - > -#define ND_CQ_NOTIFY_ERRORS 0 > -#define ND_CQ_NOTIFY_ANY 1 > -#define ND_CQ_NOTIFY_SOLICITED 2 > - > -#define ND_MR_FLAG_ALLOW_LOCAL_WRITE 0x00000001 > -#define ND_MR_FLAG_ALLOW_REMOTE_READ 0x00000002 > -#define ND_MR_FLAG_ALLOW_REMOTE_WRITE 0x00000005 > -#define ND_MR_FLAG_RDMA_READ_SINK 0x00000008 > -#define ND_MR_FLAG_DO_NOT_SECURE_VM 0x80000000 > - > -#define ND_OP_FLAG_SILENT_SUCCESS 0x00000001 > -#define ND_OP_FLAG_READ_FENCE 0x00000002 > -#define ND_OP_FLAG_SEND_AND_SOLICIT_EVENT 0x00000004 > -#define ND_OP_FLAG_ALLOW_READ 0x00000008 > -#define ND_OP_FLAG_ALLOW_WRITE 0x00000010 > - > -#if NDVER >= ND_VERSION_2 > -#define ND_OP_FLAG_INLINE 0x00000020 > -#endif > - > -#define ND_AF_INET6 23 > -#define IF_MAX_ADDR_LENGTH 32 > - > -struct group_affinity { > - u64 mask; //KYS: usually 0 > - u16 group; // KYS usually -1 > - u16 reserved[3]; > -}; > - > -struct if_physical_addr { > - u16 length; > - u8 addr[IF_MAX_ADDR_LENGTH]; > -}; > - > -struct adapter_info_v2 { > - u32 info_version; > - u16 vendor_id; > - u16 device_id; > - u64 adapter_id; > - size_t max_registration_size; > - size_t max_window_size; > - u32 max_initiator_sge; > - u32 max_recv_sge; > - u32 max_read_sge; > - u32 max_transfer_length; > - u32 max_inline_data_size; > - u32 max_inbound_read_limit; > - u32 max_outbound_read_limit; > - u32 max_recv_q_depth; > - u32 max_initiator_q_depth; > - u32 max_shared_recv_q_depth; > - u32 max_completion_q_depth; > - u32 inline_request_threshold; > - u32 large_request_threshold; > - u32 max_caller_data; > - u32 max_callee_data; > - u32 adapter_flags; > -} __packed; > - > -struct nd2_adapter_info_32 { //KYS: Check what this is > - u32 info_version; > - u16 vendor_id; > - u16 devic_id; > - u64 adapter_id; > - u32 max_registration_size; > - u32 max_window_size; > - u32 max_initiator_sge; > - u32 max_recv_sge; > - u32 max_read_sge; > - u32 max_transfer_length; > - u32 max_inline_data_size; > - u32 max_inbound_read_limit; > - u32 max_outbound_read_limit; > - u32 max_recv_q_depth; > - u32 max_initiator_q_depth; > - u32 max_shared_recv_q_depth; > - u32 max_completion_q_depth; > - u32 inline_request_threshold; > - u32 large_request_threshold; > - u32 max_caller_data; > - u32 max_callee_data; > - u32 adapter_flags; > -} __packed; > - > -enum nd2_request_type { > - ND2_RT_RECEIVE, > - ND2_RT_SEND, > - ND2_RT_BIND, > - ND2_RT_INVALIDATE, > - ND2_RT_READ, > - ND2_RT_WRITE > -}; > - > -struct nd2_result { > - u32 status; > - u32 bytes_transferred; > - void *qp_ctx; > - void *request_ctx; > - enum nd2_request_type request_type; > -} __packed; > - > -struct nd2_sge { > - void *buffer; > - u32 buffer_length; > - u32 mr_token; > -} __packed; > - > -/* > - * The communication with the host via ioctls using VMBUS > - * as the transport. 
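On the capability constants just quoted: adapter_flags in adapter_info_v2 is a plain bitmask, so consumers test bits directly. Illustrative (assuming the query results are cached as in hvnd_query_adaptor()):

    struct adapter_info_v2 *info = &nd_dev->query_pkt.ioctl.ad_info;

    if (info->adapter_flags & ND_ADAPTER_FLAG_CQ_RESIZE_SUPPORTED)
            ; /* host-side CQ resize is available */

Also worth noticing: ND_MR_FLAG_ALLOW_REMOTE_WRITE is 0x5 rather than 0x4, i.e. it already includes the ND_MR_FLAG_ALLOW_LOCAL_WRITE bit, presumably because a remote RDMA write into a region implies local write access.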
> - */ > - > -#define ND_IOCTL_VERSION 1 > - > -enum nd_mapping_type { > - ND_MAP_IOSPACE, > - ND_MAP_MEMORY, > - ND_MAP_MEMORY_COALLESCE, > - ND_MAP_PAGES, > - ND_MAP_PAGES_COALLESCE, > - ND_UNMAP_IOSPACE, > - ND_UNMAP_MEMORY, > - ND_MAX_MAP_TYPE > -}; > - > -enum nd_caching_type { > - ND_NON_CACHED = 0, > - ND_CACHED, > - ND_WRITE_COMBINED, > - ND_MAX_CACHE_TYPE > -}; > - > -enum nd_aceess_type { > - ND_READ_ACCESS = 0, > - ND_WRITE_ACCESS, > - ND_MODIFY_ACCESS > -}; > - > -struct nd_map_io_space { > - enum nd_mapping_type map_type; > - enum nd_caching_type cache_type; > - u32 cb_length; > -}; > - > -struct nd_map_memory { > - enum nd_mapping_type map_type; > - enum nd_aceess_type access_type; > - u64 address; > - u32 cb_length; > -}; > - > -struct nd_mapping_id { > - enum nd_mapping_type map_type; > - u64 id; > -}; > - > -struct ndk_map_pages { > - struct nd_map_memory header; > - u32 page_offset; > -}; > - > -union nd_mapping { > - enum nd_mapping_type map_type; > - struct nd_map_io_space map_io_space; > - struct nd_map_memory map_memory; > - struct nd_mapping_id mapping_id; > - struct ndk_map_pages map_pages; > -}; > - > -struct nd_mapping_result { > - u64 id; > - u64 info; > -}; > - > -struct nd_resource_descriptor { > - u64 handle; > - u32 ce_mapping_results; > - u32 cb_mapping_results_offset; > -}; > - > -struct nd_handle { > - u32 version; > - u32 reserved; > - u64 handle; > -}; > - > -union nd_sockaddr_inet { > - struct sockaddr_in ipv4; > - struct sockaddr_in6 ipv6; > - u16 address_family; //KYS how is this supposed to work? > -}; > - > -struct nd_address_element { > - union nd_sockaddr_inet addr; > - char mac_addr[ETH_ALEN]; > -}; > - > -struct nd_resolve_address { > - u32 version; > - u32 reserved; > - union nd_sockaddr_inet address; > -}; > - > -struct nd_open_adapter { > - u32 version; > - u32 reserved; > - u32 ce_mapping_cnt; > - u32 cb_mapping_offset; > - u64 adapter_id; > -}; > - > -struct nd_adapter_query { > - u32 version; > - u32 info_version; > - u64 adapter_handle; > -}; > - > -struct nd_create_cq { > - u32 version; > - u32 queue_depth; > - u32 ce_mapping_cnt; > - u32 cb_mapping_offset; > - u64 adapter_handle; > - struct group_affinity affinity; > -}; > - > -struct nd_create_srq { > - u32 version; > - u32 queue_depth; > - u32 ce_mapping_cnt; > - u32 cb_mapping_offset; > - u32 max_request_sge; > - u32 notify_threshold; > - u64 pd_handle; > - struct group_affinity affinity; > -}; > - > -struct nd_create_qp_hdr { > - u32 version; > - u32 cb_max_inline_data; > - u32 ce_mapping_cnt; > - u32 cb_mapping_offset; //KYS: what is this prefix - ce/cb > - u32 initiator_queue_depth; > - u32 max_initiator_request_sge; > - u64 receive_cq_handle; > - u64 initiator_cq_handle; > - u64 pd_handle; > -}; > - > -struct nd_create_qp { > - struct nd_create_qp_hdr hdr; > - u32 receive_queue_depth; > - u32 max_receive_request_sge; > -}; > - > -struct nd_create_qp_with_srq { > - struct nd_create_qp_hdr header; > - u64 srq_handle; > -}; > - > -struct nd_srq_modify { > - u32 version; > - u32 queue_depth; > - u32 ce_mapping_cnt; > - u32 cb_mapping_offset; > - u32 notify_threshold; > - u32 reserved; > - u64 srq_handle; > -}; > - > -struct nd_cq_modify { > - u32 version; > - u32 queue_depth; > - u32 ce_mapping_count; > - u32 cb_mappings_offset; > - u64 cq_handle; > -}; > - > -struct nd_cq_notify { > - u32 version; > - u32 type; > - u64 cq_handle; > -}; > - > -struct nd_mr_register_hdr { > - u32 version; > - u32 flags; > - u64 cb_length; > - u64 target_addr; > - u64 mr_handle; > -}; > - > -struct 
nd_mr_register { > - struct nd_mr_register_hdr header; > - u64 address; > -}; > - > -struct nd_bind { > - u32 version; > - u32 reserved; > - u64 handle; > - union nd_sockaddr_inet address; > -}; > - > -struct nd_read_limits { > - u32 inbound; > - u32 outbound; > -}; > - > -struct nd_connect { > - u32 version; > - u32 reserved; > - struct nd_read_limits read_limits; > - u32 cb_private_data_length; > - u32 cb_private_data_offset; > - u64 connector_handle; > - u64 qp_handle; > - union nd_sockaddr_inet destination_address; > - struct if_physical_addr phys_addr; > -}; > - > -struct nd_accept { > - u32 version; > - u32 reserved; > - struct nd_read_limits read_limits; > - u32 cb_private_data_length; > - u32 cb_private_data_offset; > - u64 connector_handle; > - u64 qp_handle; > -}; > - > -struct nd_reject { > - u32 version; > - u32 reserved; > - u32 cb_private_data_length; > - u32 cb_private_data_offset; > - u64 connector_handle; > -}; > - > -struct nd_listen { > - u32 version; > - u32 back_log; > - u64 listener_handle; > -}; > - > -struct nd_get_connection_request { > - u32 version; > - u32 reserved; > - u64 listener_handle; > - u64 connector_handle; > -}; > - > -enum ndv_mmio_type { > - ND_PARTITION_KERNEL_VIRTUAL, > - ND_PARTITION_SYSTEM_PHYSICAL, > - ND_PARTITION_GUEST_PHYSICAL, > - ND_MAXIMUM_MMIO_TYPE > -}; > - > -struct ndv_resolve_adapter_id { > - u32 version; > - struct if_physical_addr phys_addr; > -}; > - > -struct ndv_partition_create { > - u32 version; > - enum ndv_mmio_type mmio_type; > - u64 adapter_id; > - u64 xmit_cap; > -}; > - > -struct ndv_partition_bind_luid { > - u32 version; > - u32 reserved; > - u64 partition_handle; > - struct if_physical_addr phys_addr; > - //IF_LUID luid; //KYS? > -}; > - > -struct ndv_partition_bind_address { > - u32 version; > - u32 reserved; > - u64 partition_handle; > - union nd_sockaddr_inet address; > - struct if_physical_addr guest_phys_addr; > - struct if_physical_addr phys_addr; > -}; > - > -struct ndk_mr_register { > - struct nd_mr_register_hdr hdr; > - u32 cb_logical_page_addresses_offset; > -}; > - > -struct ndk_bind { > - struct nd_bind hdr; > - u64 authentication_id; > - bool is_admin; > -}; > - > -#define FDN 0x12 > -#define METHOD_BUFFERED 0x0 > -#define FAA 0x0 > - > -#define CTL_CODE( DeviceType, Function, Method, Access ) ( \ > - ((DeviceType) << 16) | ((Access) << 14) | ((Function) << 2) | (Method) \ > -) > - > -#define ND_FUNCTION(r_, i_) ((r_) << 6 | (i_)) > -#define IOCTL_ND(r_, i_) \ > - CTL_CODE( FDN, ND_FUNCTION((r_), (i_)), METHOD_BUFFERED, FAA ) > - > -#define ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) ((ctrlCode_ >> 2) & 0xFFF) > -#define ND_RESOURCE_FROM_CTL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) >> 6) > -#define ND_OPERATION_FROM_CTRL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) & 0x3F) > - > -#define ND_DOS_DEVICE_NAME L"\\DosDevices\\Global\\NetworkDirect" > -#define ND_WIN32_DEVICE_NAME L"\\\\.\\NetworkDirect" > - > -enum nd_resource_type { > - ND_PROVIDER = 0, > - ND_ADAPTER, > - ND_PD, > - ND_CQ, > - ND_MR, > - ND_MW, > - ND_SRQ, > - ND_CONNECTOR, > - ND_LISTENER, > - ND_QP, > - ND_VIRTUAL_PARTITION, > - ND_RESOURCE_TYPE_COUNT > -}; > - > -#define ND_OPERATION_COUNT 14 > - > -#define IOCTL_ND_PROVIDER(i_) IOCTL_ND(ND_PROVIDER, i_) > -#define IOCTL_ND_ADAPTER(i_) IOCTL_ND(ND_ADAPTER, i_) > -#define IOCTL_ND_PD(i_) IOCTL_ND(ND_PD, i_) > -#define IOCTL_ND_CQ(i_) IOCTL_ND(ND_CQ, i_) > -#define IOCTL_ND_MR(i_) IOCTL_ND(ND_MR, i_) > -#define IOCTL_ND_MW(i_) IOCTL_ND(ND_MW, i_) > -#define IOCTL_ND_SRQ(i_) 
IOCTL_ND(ND_SRQ, i_) > -#define IOCTL_ND_CONNECTOR(i_) IOCTL_ND(ND_CONNECTOR, i_) > -#define IOCTL_ND_LISTENER(i_) IOCTL_ND(ND_LISTENER, i_) > -#define IOCTL_ND_QP(i_) IOCTL_ND(ND_QP, i_) > -#define IOCTL_ND_VIRTUAL_PARTITION(i_) IOCTL_ND(ND_VIRTUAL_PARTITION, i_) > - > -/* Provider IOCTLs */ > -#define IOCTL_ND_PROVIDER_INIT IOCTL_ND_PROVIDER( 0 ) > -#define IOCTL_ND_PROVIDER_BIND_FILE IOCTL_ND_PROVIDER( 1 ) > -#define IOCTL_ND_PROVIDER_QUERY_ADDRESS_LIST IOCTL_ND_PROVIDER( 2 ) > -#define IOCTL_ND_PROVIDER_RESOLVE_ADDRESS IOCTL_ND_PROVIDER( 3 ) > -#define IOCTL_ND_PROVIDER_MAX_OPERATION 4 > - > -/* Adapter IOCTLs */ > -#define IOCTL_ND_ADAPTER_OPEN IOCTL_ND_ADAPTER( 0 ) > -#define IOCTL_ND_ADAPTER_CLOSE IOCTL_ND_ADAPTER( 1 ) > -#define IOCTL_ND_ADAPTER_QUERY IOCTL_ND_ADAPTER( 2 ) > -#define IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST IOCTL_ND_ADAPTER( 3 ) > -#define IOCTL_ND_ADAPTER_MAX_OPERATION 4 > - > -/* Protection Domain IOCTLs */ > -#define IOCTL_ND_PD_CREATE IOCTL_ND_PD( 0 ) > -#define IOCTL_ND_PD_FREE IOCTL_ND_PD( 1 ) > -#define IOCTL_ND_PD_MAX_OPERATION 2 > - > -/* Completion Queue IOCTLs */ > -#define IOCTL_ND_CQ_CREATE IOCTL_ND_CQ( 0 ) > -#define IOCTL_ND_CQ_FREE IOCTL_ND_CQ( 1 ) > -#define IOCTL_ND_CQ_CANCEL_IO IOCTL_ND_CQ( 2 ) > -#define IOCTL_ND_CQ_GET_AFFINITY IOCTL_ND_CQ( 3 ) > -#define IOCTL_ND_CQ_MODIFY IOCTL_ND_CQ( 4 ) > -#define IOCTL_ND_CQ_NOTIFY IOCTL_ND_CQ( 5 ) > -#define IOCTL_ND_CQ_MAX_OPERATION 6 > - > -/* Memory Region IOCTLs */ > -#define IOCTL_ND_MR_CREATE IOCTL_ND_MR( 0 ) > -#define IOCTL_ND_MR_FREE IOCTL_ND_MR( 1 ) > -#define IOCTL_ND_MR_CANCEL_IO IOCTL_ND_MR( 2 ) > -#define IOCTL_ND_MR_REGISTER IOCTL_ND_MR( 3 ) > -#define IOCTL_ND_MR_DEREGISTER IOCTL_ND_MR( 4 ) > -#define IOCTL_NDK_MR_REGISTER IOCTL_ND_MR( 5 ) > -#define IOCTL_ND_MR_MAX_OPERATION 6 > - > -/* Memory Window IOCTLs */ > -#define IOCTL_ND_MW_CREATE IOCTL_ND_MW( 0 ) > -#define IOCTL_ND_MW_FREE IOCTL_ND_MW( 1 ) > -#define IOCTL_ND_MW_MAX_OPERATION 2 > - > -/* Shared Receive Queue IOCTLs */ > -#define IOCTL_ND_SRQ_CREATE IOCTL_ND_SRQ( 0 ) > -#define IOCTL_ND_SRQ_FREE IOCTL_ND_SRQ( 1 ) > -#define IOCTL_ND_SRQ_CANCEL_IO IOCTL_ND_SRQ( 2 ) > -#define IOCTL_ND_SRQ_GET_AFFINITY IOCTL_ND_SRQ( 3 ) > -#define IOCTL_ND_SRQ_MODIFY IOCTL_ND_SRQ( 4 ) > -#define IOCTL_ND_SRQ_NOTIFY IOCTL_ND_SRQ( 5 ) > -#define IOCTL_ND_SRQ_MAX_OPERATION 6 > - > -/* Connector IOCTLs */ > -#define IOCTL_ND_CONNECTOR_CREATE IOCTL_ND_CONNECTOR( 0 ) > -#define IOCTL_ND_CONNECTOR_FREE IOCTL_ND_CONNECTOR( 1 ) > -#define IOCTL_ND_CONNECTOR_CANCEL_IO IOCTL_ND_CONNECTOR( 2 ) > -#define IOCTL_ND_CONNECTOR_BIND IOCTL_ND_CONNECTOR( 3 ) > -#define IOCTL_ND_CONNECTOR_CONNECT IOCTL_ND_CONNECTOR( 4 ) > -#define IOCTL_ND_CONNECTOR_COMPLETE_CONNECT IOCTL_ND_CONNECTOR( 5 ) > -#define IOCTL_ND_CONNECTOR_ACCEPT IOCTL_ND_CONNECTOR( 6 ) > -#define IOCTL_ND_CONNECTOR_REJECT IOCTL_ND_CONNECTOR( 7 ) > -#define IOCTL_ND_CONNECTOR_GET_READ_LIMITS IOCTL_ND_CONNECTOR( 8 ) > -#define IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA IOCTL_ND_CONNECTOR( 9 ) > -#define IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS IOCTL_ND_CONNECTOR( 10 ) > -#define IOCTL_ND_CONNECTOR_GET_ADDRESS IOCTL_ND_CONNECTOR( 11 ) > -#define IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT IOCTL_ND_CONNECTOR( 12 ) > -#define IOCTL_ND_CONNECTOR_DISCONNECT IOCTL_ND_CONNECTOR( 13 ) > -#define IOCTL_ND_CONNECTOR_MAX_OPERATION 14 > - > -/* Listener IOCTLs */ > -#define IOCTL_ND_LISTENER_CREATE IOCTL_ND_LISTENER( 0 ) > -#define IOCTL_ND_LISTENER_FREE IOCTL_ND_LISTENER( 1 ) > -#define IOCTL_ND_LISTENER_CANCEL_IO IOCTL_ND_LISTENER( 2 ) > 
-#define IOCTL_ND_LISTENER_BIND IOCTL_ND_LISTENER( 3 ) > -#define IOCTL_ND_LISTENER_LISTEN IOCTL_ND_LISTENER( 4 ) > -#define IOCTL_ND_LISTENER_GET_ADDRESS IOCTL_ND_LISTENER( 5 ) > -#define IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST IOCTL_ND_LISTENER( 6 ) > -#define IOCTL_ND_LISTENER_MAX_OPERATION 7 > - > -/* Queue Pair IOCTLs */ > -#define IOCTL_ND_QP_CREATE IOCTL_ND_QP( 0 ) > -#define IOCTL_ND_QP_CREATE_WITH_SRQ IOCTL_ND_QP( 1 ) > -#define IOCTL_ND_QP_FREE IOCTL_ND_QP( 2 ) > -#define IOCTL_ND_QP_FLUSH IOCTL_ND_QP( 3 ) > -#define IOCTL_ND_QP_MAX_OPERATION 4 > - > -/* Kernel-mode only IOCTLs (IRP_MJ_INTERNAL_DEVICE_CONTROL) */ > -#define IOCTL_NDV_PARTITION_RESOLVE_ADAPTER_ID IOCTL_ND_VIRTUAL_PARTITION( 0 ) > -#define IOCTL_NDV_PARTITION_CREATE IOCTL_ND_VIRTUAL_PARTITION( 1 ) > -#define IOCTL_NDV_PARTITION_FREE IOCTL_ND_VIRTUAL_PARTITION( 2 ) > -#define IOCTL_NDV_PARTITION_BIND IOCTL_ND_VIRTUAL_PARTITION( 3 ) > -#define IOCTL_NDV_PARTITION_UNBIND IOCTL_ND_VIRTUAL_PARTITION( 4 ) > -#define IOCTL_NDV_PARTITION_BIND_LUID IOCTL_ND_VIRTUAL_PARTITION( 5 ) > -#define IOCTL_NDV_PARTITION_MAX_OPERATION 6 > - > - > -#define MB_SHIFT 20 > - > - > -/* Ringbuffer size for the channel */ > -#define NDV_NUM_PAGES_IN_RING_BUFFER 64 > - > -#define NDV_MAX_PACKETS_PER_RECEIVE 8 > - > -#define NDV_MAX_PACKET_COUNT 16304 > - > -#define NDV_MAX_NUM_OUTSTANDING_RECEIVED_PACKETS (16304) > -#define NDV_MAX_HANDLE_TABLE_SIZE (16304) > -#define NDV_HOST_MAX_HANDLE_TABLE_SIZE (NDV_MAX_HANDLE_TABLE_SIZE * 16) > - > - > -#define NDV_MAX_MAPPINGS 4 > - > -#define NDV_STATE_NONE 0x00000000 > -#define NDV_STATE_CREATED 0x00000001 > -#define NDV_STATE_CONNECTING 0x00000002 > -#define NDV_STATE_INITIALIZING 0x00000003 > -#define NDV_STATE_OPERATIONAL 0xEFFFFFFF > -#define NDV_STATE_FAILED 0xFFFFFFFF > - > - > -#define NDV_MAX_PRIVATE_DATA_SIZE 64 > -#define NDV_MAX_IOCTL_SIZE 256 > - > -/* max size of buffer for vector of ND_MAPPING */ > -#define NDV_MAX_MAPPING_BUFFER_SIZE \ > - (NDV_MAX_MAPPINGS * sizeof(union nd_mapping)) > - > -/* max expected ioctl buffer size from users */ > -#define NDV_MAX_IOCTL_BUFFER_SIZE \ > - (NDV_MAX_IOCTL_SIZE + \ > - NDV_MAX_MAPPING_BUFFER_SIZE + \ > - NDV_MAX_PRIVATE_DATA_SIZE) > - > -/* max PFN array for inline buffers */ > -#define NDV_MAX_INLINE_PFN_ARRAY_LENGTH 32 > - > -/* Field header size for inline buffer */ > -#define NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE \ > - (NDV_MAX_MAPPINGS * sizeof(NDV_PACKET_FIELD)) > - > -/* Max for a single field */ > - > -#define NDV_MAX_SINGLE_MAPPING_FIELD ( sizeof(GPA_RANGE) + \ > - (sizeof(PFN_NUMBER) * NDV_MAX_INLINE_PFN_ARRAY_LENGTH)) > - > -/* Max for all inine data */ > - > -#define NDV_MAX_MAPPING_DATA_SIZE (NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE + \ > - (NDV_MAX_MAPPINGS * NDV_MAX_SINGLE_MAPPING_FIELD)) > - > - > -#define NDV_MAX_PACKET_HEADER_SIZE 256 > - > -#define NDV_MAX_PACKET_SIZE (NDV_MAX_PACKET_HEADER_SIZE + \ > - NDV_MAX_IOCTL_BUFFER_SIZE + \ > - NDV_MAX_MAPPING_DATA_SIZE) > - > -/* Well known message type INIT is defined for the channel > - * not for the protocol. > - */ > - > -#define NDV_PACKET_TYPE_INIT 0xFFFFFFFF > - > -/* Invalid protocol version to to identify uninitialized channels */ > - > -#define NDV_PROTOCOL_VERSION_INVALID 0xFFFFFFFF > - > -/* Flags that control the bahavior of packet handling */ > - > -enum ndv_packet_options { > - NDV_PACKET_OPTION_NONE = 0x00, > - > - /* Indicates that the ExternalDataMdl parameter is expectected to be > - * passed and must be handled in the reciever. 
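A worked example of the ioctl numbering defined a little earlier in this header, using only the quoted macros (ND_CQ is 3 in enum nd_resource_type):

    /*
     * IOCTL_ND_CQ_NOTIFY == IOCTL_ND_CQ(5):
     *
     *   function = ND_FUNCTION(ND_CQ, 5) = (3 << 6) | 5 = 0xc5
     *   code     = CTL_CODE(0x12, 0xc5, METHOD_BUFFERED, FAA)
     *            = (0x12 << 16) | (0xc5 << 2)
     *            = 0x120314
     *
     * and the decode macros recover both halves:
     *
     *   ND_RESOURCE_FROM_CTL_CODE(0x120314)    == 3 (ND_CQ)
     *   ND_OPERATION_FROM_CTRL_CODE(0x120314)  == 5
     */

So the resource type and operation index can always be recovered from a wire code, which is what debug helpers such as hvnd_get_op_name() can key off.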
This call must be > - * handled specially to ensure that the MDL can be created correctly. > - */ > - NDV_PACKET_OPTION_EXTERNAL_DATA = 0x01, > - > - /* Inicates that the reciever must execution the handler at passive. */ > - NDV_PACKET_OPTIONS_REQUIRES_PASSIVE = 0x02, > - > - /* Indicates that the sender does not expect and is not waiting for a > - * response packet. > - */ > - NDV_PACKET_OPTIONS_POST = 0x04, > -}; > - > -#define NDV_PACKET_TYPE(id_, opt_) \ > - (((opt_)<<24) | (id_)) > - > -#define NDV_PACKET_TYPE_OPTIONS(type_) \ > - (((type_) >> 24) & 0xFF) > - > -#define NDV_PACKET_TYPE_ID(type_) \ > - ((type_) & 0xFFFFFF) \ > - > -#define NDV_ADD_PACKET_OPTION(type_, opt_) \ > - (type_) |= (opt_<<24) > - > -/* The header value sent on all packets */ > -union ndv_packet_hdr { > - > - struct { > - /* The type of packet. > - * This value should be created with the NDV_PACKET_TYPE macro > - * to include all packet options within the packet type. > - */ > - u32 packet_type; > - /* The size of the entire fixed message structure that exists > - * before the data. This must be >= sizeof(NDV_PACKET_HEADER) > - */ > - u32 hdr_sz; > - /* This size of the data that follows the message > - * data_sz + hdr_sz size gives the total size of > - * the buffer that is used. > - */ > - u32 data_sz; > - /* The status code used to indicate success or failure. > - * It is only used in completions and during responses. > - */ > - u32 status; //KYS: NTSTATUS? > - }; > - > - u64 padding[2]; //KYS: why? > -}; > - > - > -/* The core INIT packet. This message is defined in the channel > - * not in the protocol. This message should never change size > - * or behavior, as it could impact compatibility in the future. > - * This packet is used to negotiate the protocol version, so chaning > - * this size could break backward compat. > - */ > - > -union ndv_packet_init { > - struct { > - u32 packet_type; > - u32 protocol_version; > - u32 flags; > - }; > - u64 padding[2]; > -} __packed; > - > -#define NDV_PACKET_INIT_SIZE 16 > - > -/* Data packing flags used for accessing the dynamic fields inside a packet */ > -#define NDV_DATA_PACKING_2 0x1 > -#define NDV_DATA_PACKING_4 0x3 > -#define NDV_DATA_PACKING_8 0x7 > - > - > -#define NDV_PROTOCOL_VERSION_1 0x0100 > -#define NDV_PROTOCOL_VERSION_CURRENT NDV_PROTOCOL_VERSION_1 > -#define NDV_PROTOCOL_VERSION_COUNT 1 > - > -struct ndv_pkt_field { > - u32 size; > - u32 offset; > -}; > - > -enum ndv_pkt_id { > - NDV_PKT_UNKNOWN = 0, > - /* Version 1 Message ID's */ > - NDV_PKT_ID1_BIND, > - NDV_PKT_ID1_CREATE, > - NDV_PKT_ID1_CLEANUP, > - NDV_PKT_ID1_CANCEL, > - NDV_PKT_ID1_CONTROL, > - NDV_PKT_ID1_COMPLETE, > - NDV_PKT_ID1_INIT_RESOURCES, > -}; > - > -/* The guest will send this as the first messages just after init > - * The resources are reserved per channel. > - */ > - > -struct ndv_pkt_hdr_init_resources_1 { > - > - union ndv_packet_hdr pkt_hdr; > - u16 io_space_sz_mb; > - u64 io_space_start; > - > -}; > - > - > - > -/* The guest will send this packet to the host after channel init > - * to query support for the adapters that are registered. 
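The packet-type packing quoted above keeps the ID in the low 24 bits and the option flags in the top byte. A small sketch of how the bind path seen earlier in the .c file composes and decomposes a type (macros from this header):

    int pkt_type = NDV_PKT_ID1_BIND;

    NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE);

    /* pkt_type is now 0x02000001: options 0x02, id 0x000001 */
    WARN_ON(NDV_PACKET_TYPE_ID(pkt_type) != NDV_PKT_ID1_BIND);
    WARN_ON(NDV_PACKET_TYPE_OPTIONS(pkt_type) !=
            NDV_PACKET_OPTIONS_REQUIRES_PASSIVE);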
> - */ > - > -struct ndv_pkt_hdr_bind_1 { > - union ndv_packet_hdr pkt_hdr; > - bool unbind; > - union nd_sockaddr_inet ip_address; > - struct if_physical_addr phys_addr; > - u64 guest_id; > -}; > - > -union ndv_context_handle { > - u64 val64; > - struct { > - u32 local; > - u32 remote; > - }; > -}; > - > -struct ndv_pkt_hdr_create_1 { > - union ndv_packet_hdr pkt_hdr; > - > - /* Identifies the object used to track this file handle on both > - * the guest and the host. When sent from the guest, it will contain > - * the guest handle. On success, the host will populate and return > - * it's handle value as well. > - */ > - > - union ndv_context_handle handle; > - > - /* The parameters sent to the CreateFile call */ > - u32 access_mask; > - u32 open_options; > - > - u16 file_attributes; //KYS: This field must be 64 bit aligned > - > - u16 share_access; //KYS > - > - u32 kys_padding; //KYS > - > - u16 ea_length; //KYS; needs to be 64 bit aligned; what is ea length - unused > -}; > - > - > -struct ndv_pkt_hdr_cleanup_1 { > - union ndv_packet_hdr pkt_hdr; > - > - /* Identifies the object used to track this file handle on both > - * the guest and the host. When sent from the guest, it will contain > - * the both the guest and host handle values. The host will use this > - * value to cleanup its resource, then update its portion of the handle > - * to NDV_HANDLE_NULL before returning the data back to the guest. > - */ > - union ndv_context_handle handle; > -}; > - > -struct ndv_pkt_hdr_cancel_1 { > - union ndv_packet_hdr pkt_hdr; > - union ndv_context_handle file_handle; > - union ndv_context_handle irp_handle; > -}; > - > -struct ndv_bind_port_info { > - //LUID authentication_id; //KYS: LUID? > - bool is_admin; > -}; > - > -struct ndv_extended_data_flds { > - union { > - u32 field_count; > - u64 padding; > - }; > - > - //struct ndv_pkt_field fields[ANYSIZE_ARRAY]; //KYS? > -}; > - > - > -struct ndv_packet_hdr_control_1 { > - union ndv_packet_hdr pkt_hdr; > - /* Identifies the object used to track this file handle on both > - * the guest and the host. This should always have both guest > - * and host handle values inside it. > - */ > - > - union ndv_context_handle file_handle; > - > - /* The handle information for the allocated irp context object. > - * This information is used when the host/guest starts the cancelation > - */ > - union ndv_context_handle irp_handle; > - > - /* The input data describing in the IO control parameters */ > - > - u32 io_cntrl_code; > - u32 output_buf_sz; > - u32 input_buf_sz; > - u32 input_output_buf_offset; > - > - /* These are used in the return message to indicate the status of the IO > - * operation and the amount of data written to the output buffer. > - */ > - u32 io_status; //KYS: NTSTATUS? > - u32 bytes_returned; > - > - /* This contains the field information for additional data that is sent > - * with the packet that is IOCTL specific. > - */ > - > - struct ndv_pkt_field extended_data; > -}; > - > -/* > - * Include MLX specific defines. > - */ > - > -#include "mx_abi.h" > - > -/* Driver specific state. > - */ > - > -/* > - * We need to have host open a file; some > - * Windows constants for open. 
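union ndv_context_handle above is the glue for guest/host object identity: two 32-bit handles overlaid on the single u64 that travels on the wire. Illustrative use (field names from the header, the particular usage assumed):

    union ndv_context_handle h;

    h.val64 = hdr->irp_handle.val64;        /* wire view: one u64 */

    /* split view: guest-side and host-side halves */
    pr_debug("local %u remote %u\n", h.local, h.remote);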
> - */ > -#define STANDARD_RIGHTS_ALL (0x001F0000L) > -#define FILE_ATTRIBUTE_NORMAL (0x80) > -#define FILE_SHARE_READ (0x00000001) > -#define FILE_SHARE_WRITE (0x00000002) > -#define FILE_SHARE_DELETE (0x00000004) > -#define FILE_FLAG_OVERLAPPED (0x40000000) > -#define FILE_SHARE_ALL (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE) > -#define CREATE_ALWAYS (2) > -#define OPEN_EXISTING (3) > - > -#define RTL_NUMBER_OF(_x) \ > - sizeof(_x)/sizeof(_x[0]) > -/* > - * The context structure tracks the open state. > - */ > - > -/* > - * Packet layout for open adaptor. > - */ > - > -/* > - * Packet for querying the address list. > - */ > - > -union query_addr_list_ioctl { > - struct nd_handle in; > - union nd_sockaddr_inet out[16]; //KYS a max of 16 addresses > -}; > - > -struct pkt_query_addr_list { > - struct ndv_packet_hdr_control_1 hdr; > - union query_addr_list_ioctl ioctl; > - unsigned long activity_id; > -}; > - > - > -struct pkt_fld { > - u32 size; > - u32 offset; > -}; > - > -struct fld_data { > - union { > - u64 padding; > - }; > -}; > - > -struct extended_data_oad { > - union { > - u32 cnt; > - u64 padding; > - }; > - /* offsets are from start of extended data struct > - * and should start on 8 byte boundary > - */ > - struct pkt_fld fields[IBV_GET_CONTEXT_MAPPING_MAX]; > -}; > - > -union oad_ioctl { > - struct nd_open_adapter input; > - struct nd_resource_descriptor resrc_desc; > -}; > - > -union oad_mappings { > - struct ibv_get_context_req ctx_input; > - struct ibv_get_context_resp ctx_output; > -}; > - > -struct pkt_nd_open_adapter { > - struct ndv_packet_hdr_control_1 hdr; > - > - union oad_ioctl ioctl; > - union oad_mappings mappings; > - > - /* > - * Extended data. > - */ > - struct extended_data_oad ext_data; > -}; > - > -/* > - * Create CQ IOCTL. > - */ > - > -struct cq_db_gpa { > - u32 byte_count; > - u32 byte_offset; > - u64 pfn_array[2]; > -}; > - > -struct cq_sn_gpa { > - u32 byte_count; > - u32 byte_offset; > - u64 pfn_array[2]; > -}; > - > -struct create_cq_ext_data { > - union { > - u32 cnt; > - u64 padding; > - }; > - /* offsets are from start of extended data struct > - * and should start on 8 byte boundary > - */ > - struct pkt_fld fields[MLX4_IB_CREATE_CQ_MAPPING_MAX]; > - struct cq_db_gpa db_gpa; > - struct cq_sn_gpa sn_gpa; > - struct gpa_range cqbuf_gpa; > -}; > - > -union create_cq_ioctl { > - struct nd_create_cq input; > - struct nd_resource_descriptor resrc_desc; > -}; > - > -union create_cq_mappings { > - struct ibv_create_cq cq_in; > - struct ibv_create_cq_resp cq_resp; > -}; > - > -struct pkt_nd_create_cq { > - struct ndv_packet_hdr_control_1 hdr; > - > - union create_cq_ioctl ioctl; > - union create_cq_mappings mappings; > - > - /* > - * Extended data. > - */ > - struct create_cq_ext_data ext_data; > -}; > - > -/* > - * IOCTL to free CQ. 
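The extended-data blocks quoted above are self-describing: a count, then {size, offset} descriptors, with each offset measured from the start of the extended-data struct (per the comments). A hypothetical accessor, assuming exactly that layout:

    /* not in the driver; returns the idx'th variable-size field */
    static void *oad_ext_field(struct extended_data_oad *ed, int idx)
    {
            if (idx >= (int)ed->cnt)
                    return NULL;
            return (u8 *)ed + ed->fields[idx].offset;
    }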
> - */ > -struct free_cq_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_free_cq { > - struct ndv_packet_hdr_control_1 hdr; > - > - struct free_cq_ioctl ioctl; > -}; > - > - > -/* > - * IOCTL to QUERY CQ - CQ NOTIFY > - */ > - > -struct notify_cq_ioctl { > - struct nd_cq_notify in; > -}; > - > -struct pkt_nd_notify_cq { > - struct ndv_packet_hdr_control_1 hdr; > - struct notify_cq_ioctl ioctl; > -}; > - > -/* > - * IOCTL to Create a listner > - */ > - > -struct nd_ep_create { > - struct nd_handle hdr; > - bool to_semantics; > - unsigned long activity_id; > -}; > - > -union listener_cr_ioctl { > - struct nd_ep_create in; > - u64 out; > -}; > - > -struct pkt_nd_cr_listener { > - struct ndv_packet_hdr_control_1 hdr; > - union listener_cr_ioctl ioctl; > -}; > - > -/* > - * IOCTL to free listener. > - */ > - > -struct listener_free_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_free_listener { > - struct ndv_packet_hdr_control_1 hdr; > - struct listener_free_ioctl ioctl; > -}; > - > -/* > - * IOCTL for listener cancel IO. > - */ > -struct listener_cancelio_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_cancelio_listener { > - struct ndv_packet_hdr_control_1 hdr; > - struct listener_cancelio_ioctl ioctl; > -}; > - > -/* > - * IOCTL for LISTENER BIND > - */ > - > -union listener_bind_ioctl { > - struct ndk_bind in; > -}; > - > -struct pkt_nd_bind_listener { > - struct ndv_packet_hdr_control_1 hdr; > - union listener_bind_ioctl ioctl; > -}; > - > -/* > - * After the listener is bound, enable > - * listening. > - */ > - > -union listener_listen_ioctl { > - struct nd_listen in; > -}; > - > -struct pkt_nd_listen_listener { > - struct ndv_packet_hdr_control_1 hdr; > - union listener_listen_ioctl ioctl; > -}; > - > -/* > - * IOCTL for getting the adddress from listener. > - * > - */ > - > -union listener_get_addr_ioctl { > - struct nd_handle in; > - union nd_sockaddr_inet out; > -}; > - > -struct pkt_nd_get_addr_listener { > - struct ndv_packet_hdr_control_1 hdr; > - union listener_get_addr_ioctl ioctl; > -}; > - > -/* > - * IOCTL to get a connection from a listener. > - */ > - > -union listener_get_connection_ioctl { > - struct nd_get_connection_request in; > - union nd_sockaddr_inet out; > -}; > - > -struct pkt_nd_get_connection_listener { > - struct ndv_packet_hdr_control_1 hdr; > - union listener_get_connection_ioctl ioctl; > -}; > - > - > -/* > - * Connector IOCTLs > - */ > - > -/* > - * IOCTL to create connector. > - */ > - > -union connector_cr_ioctl { //KYS should this be a union or struct? > - struct nd_ep_create in; > - u64 out; > -}; > - > -struct pkt_nd_cr_connector { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_cr_ioctl ioctl; //KYS: union or struct > -}; > - > -/* > - * IOCTL to free connector. > - */ > - > -struct connector_free_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_free_connector { > - struct ndv_packet_hdr_control_1 hdr; > - struct connector_free_ioctl ioctl; > -}; > - > -/* > - * IOCTL to cancel I/O on a connector. > - */ > - > -struct connector_cancelio_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_cancelio_connector { > - struct ndv_packet_hdr_control_1 hdr; > - struct connector_cancelio_ioctl ioctl; > -}; > - > -/* > - * IOCTL to Bind an address to the connector. 
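Taken together, the listener packets above imply the expected setup order on the passive side. A sketch of the sequence, with the ordering inferred from the quoted comments ("After the listener is bound, enable listening") rather than stated anywhere explicitly:

    /*
     *   IOCTL_ND_LISTENER_CREATE                  -> listener handle
     *   IOCTL_ND_LISTENER_BIND                    -> attach local address
     *   IOCTL_ND_LISTENER_LISTEN                  -> arm the backlog
     *   IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST  -> hand back a connector
     */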
> - */ > - > -union connector_bind_ioctl { > - struct ndk_bind in; > -}; > - > -struct pkt_nd_bind_connector { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_bind_ioctl ioctl; > -}; > - > -/* > - * IOCTL to connect a connector. > - */ > - > -struct connector_connect_in { > - struct nd_connect hdr; > - u8 retry_cnt; > - u8 rnr_retry_cnt; > - u8 priv_data[56]; > - unsigned long activity_id; > -}; > - > -union connector_connect_ioctl { > - struct connector_connect_in in; > -}; > - > -struct pkt_nd_connector_connect { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_connect_ioctl ioctl; > -}; > - > -/* > - * IOCTL for connector complete connect > - */ > - > -struct complete_connect_in { > - struct nd_handle hdr; > - u8 rnr_nak_to; > - unsigned long activity_id; > -}; > - > -struct complete_connect_out { > - enum ibv_qp_state state; > -}; > - > -union connector_complete_connect_ioctl { > - struct complete_connect_in in; > - struct complete_connect_out out; > -}; > - > -struct pkt_nd_connector_connect_complete { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_complete_connect_ioctl ioctl; > -}; > - > - > -#define MAX_PRIVATE_DATA_LEN 148 > - > -/* > - * IOCTL for connector accept. > - */ > - > -struct connector_accept_in { > - struct nd_accept hdr; > - u8 rnr_retry_cnt; > - u8 rnr_nak_to; > - u8 private_data[MAX_PRIVATE_DATA_LEN]; > - unsigned long activity_id; > -}; > - > -struct connector_accept_out { > - enum ibv_qp_state state; > -}; > - > -union connector_accept_ioctl { > - struct connector_accept_in in; > - struct connector_accept_out out; > -}; > - > -struct pkt_nd_connector_accept { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_accept_ioctl ioctl; > -}; > - > -/* > - * IOCTL for connector to reject a connection. > - */ > - > -struct connector_reject_in { > - struct nd_reject hdr; > - u8 private_data[MAX_PRIVATE_DATA_LEN]; > -}; > - > -struct connector_reject_out { > - enum ibv_qp_state state; > -}; > - > -union connector_reject_ioctl { > - struct connector_reject_in in; > - struct connector_reject_out out; > -}; > - > -struct pkt_nd_connector_reject { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_reject_ioctl ioctl; > -}; > - > -/* > - * IOCTL to get connector read limits. > - */ > - > -struct connector_get_rd_limits_in { > - struct nd_handle in; > -}; > - > -struct connector_get_rd_limits_out { > - struct nd_read_limits out; > -}; > - > -union connector_get_rd_limits_ioctl { > - struct connector_get_rd_limits_in in; > - struct connector_get_rd_limits_out out; > -}; > - > -struct pkt_nd_connector_get_rd_limits { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_get_rd_limits_ioctl ioctl; > -}; > - > -/* > - * IOCTL to get connector private data. > - */ > -union connector_get_priv_data_ioctl { > - struct nd_handle in; > - u8 out[MAX_PRIVATE_DATA_LEN]; > -}; > - > -struct pkt_nd_connector_get_priv_data { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_get_priv_data_ioctl ioctl; > -}; > - > - > -/* > - * IOCTL get peer address. > - */ > - > -union connector_get_peer_addr_ioctl { > - struct nd_handle in; > - union nd_sockaddr_inet out; > -}; > - > -struct pkt_nd_connector_get_peer_addr { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_get_peer_addr_ioctl ioctl; > -}; > - > -/* > - * IOCTL to get connector address. 
> - */ > - > -union connector_get_addr_ioctl { > - struct nd_handle in; > - union nd_sockaddr_inet out; > -}; > - > -struct pkt_nd_connector_get_addr { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_get_addr_ioctl ioctl; > -}; > - > -/* > - * IOCTL for disconnect notification. > - */ > - > -union connector_notify_disconnect_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_connector_notify_disconnect { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_notify_disconnect_ioctl ioctl; > -}; > - > -union connector_disconnect_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_connector_disconnect { > - struct ndv_packet_hdr_control_1 hdr; > - union connector_notify_disconnect_ioctl ioctl; > -}; > - > -/* > - * IOCTLs for QP operations. > - */ > - > -/* > - * Create qp IOCTL. > - */ > - > -struct qp_db_gpa { > - u32 byte_count; > - u32 byte_offset; > - u64 pfn_array[1]; > -}; > - > -struct create_qp_ext_data { > - union { > - u32 cnt; > - u64 padding; > - }; > - /* offsets are from start of extended data struct > - * and should start on 8 byte boundary > - */ > - struct pkt_fld fields[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; > - struct qp_db_gpa db_gpa; > - struct gpa_range qpbuf_gpa; > -}; > - > -union create_qp_ioctl { > - struct nd_create_qp input; > - struct nd_resource_descriptor resrc_desc; > -}; > - > -union create_qp_mappings { > - struct ibv_create_qp qp_in; > - struct ibv_create_qp_resp qp_resp; > -}; > - > -struct pkt_nd_create_qp { > - struct ndv_packet_hdr_control_1 hdr; > - > - union create_qp_ioctl ioctl; > - union create_qp_mappings mappings; > - > - /* > - * Extended data. > - */ > - struct create_qp_ext_data ext_data; > -}; > - > -/* > - * IOCTL to flush a QP. > - */ > -struct flush_qp_ioctl { > - struct nd_handle in; > - enum ibv_qp_state out; > -}; > - > -struct pkt_nd_flush_qp { > - struct ndv_packet_hdr_control_1 hdr; > - struct flush_qp_ioctl ioctl; > -}; > - > -/* > - * Memory Region IOCTLS > - */ > -union create_mr_ioctl { > - struct nd_handle in; > - u64 out; > -}; > - > -struct pkt_nd_create_mr { > - struct ndv_packet_hdr_control_1 hdr; > - union create_mr_ioctl ioctl; > -}; > - > -struct mr_out { > - u32 lkey; > - u32 rkey; > - unsigned long activity_id; > -}; > - > - > -union register_mr_ioctl { > - struct nd_mr_register in; > - struct mr_out out; > -}; > - > -struct pkt_nd_register_mr { > - struct ndv_packet_hdr_control_1 hdr; > - union register_mr_ioctl ioctl; > -}; > - > -struct deregister_mr_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_deregister_mr { > - struct ndv_packet_hdr_control_1 hdr; > - struct deregister_mr_ioctl ioctl; > -}; > - > -/* > - * IOCTL to disconnect connector > - */ > - > -/* > - * Create PD IOCTL. > - */ > -struct nd_create_pd_ioctl { > - union { > - struct nd_handle in; > - u64 out_handle; > - }; > - struct ibv_alloc_pd_resp resp; > -}; > - > -struct pkt_nd_pd_create { > - struct ndv_packet_hdr_control_1 hdr; > - struct nd_create_pd_ioctl ioctl; > -}; > - > -/* > - * Free Handle. Check the layout with Luke. > - * > - */ > -struct free_handle_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_free_handle { > - struct ndv_packet_hdr_control_1 hdr; > - struct free_handle_ioctl ioctl; > -}; > - > -/* > - * Cancel I/O. 
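A recurring shape in these definitions: each ioctl is a union whose request and reply views share one buffer, so the host's answer simply overwrites the request in place. Illustrative, with the field usage assumed rather than quoted:

    union create_mr_ioctl io;

    io.in.version = ND_VERSION_1;   /* request view */
    io.in.reserved = 0;
    io.in.handle = pd_handle;

    /* ... round trip to the host ... */

    mr_handle = io.out;             /* reply view, same bytes */

This keeps the wire packet a fixed size regardless of direction.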
> - */ > - > -struct cancel_io_ioctl { > - struct nd_handle in; > -}; > - > -struct pkt_nd_cancel_io { > - struct ndv_packet_hdr_control_1 hdr; > - struct cancel_io_ioctl ioctl; > -}; > - > -/* > - * Connector states: > - */ > - > -enum connector_state { > - HVND_CON_INCOMING, > - HVND_CON_INCOMING_ESTABLISHED, > - HVND_CON_INCOMING_REJECTED, > - HVND_CON_OUTGOING_REQUEST > -}; > - > - > -/* > - * Adaptor query IOCTL. > - */ > -struct nd_adap_query_ioctl { > - union { > - struct nd_adapter_query ad_q; > - struct adapter_info_v2 ad_info; > - }; > -}; > - > -struct pkt_nd_query_adaptor { > - struct ndv_packet_hdr_control_1 hdr; > - struct nd_adap_query_ioctl ioctl; > -}; > - > -struct nd_ioctl { > - union { > - struct nd_handle handle; > - u8 raw_buffer[NDV_MAX_IOCTL_BUFFER_SIZE]; > - }; > -}; > - > -struct pkt_nd_provider_ioctl { > - struct ndv_packet_hdr_control_1 hdr; > - struct nd_ioctl ioctl; > -}; > - > -struct hvnd_ib_pd { > - struct ib_pd ibpd; > - u32 pdn; > - u64 handle; > -}; > - > -struct hvnd_work { > - struct work_struct work; > - void *callback_arg; > -}; > - > -struct hvnd_disconnect_work { > - struct work_struct work; > - int status; > - void *callback_arg; > -}; > - > -/* > -struct hvnd_delayed_work { > - struct delayed_work work; > - void *callback_arg; > -}; > -*/ > - > -enum hvnd_cm_state { > - hvnd_cm_idle = 0, > - hvnd_cm_connect_reply_sent, //active > - hvnd_cm_connect_reply_refused, > - hvnd_cm_connect_received, //active > - hvnd_cm_connect_request_sent, //passive > - hvnd_cm_accept_sent, > - hvnd_cm_close_sent, > - hvnd_cm_established_sent, > -}; > - > -struct incoming_pkt { > - struct list_head list_entry; > - char pkt[0]; > -}; > - > -struct hvnd_ep_obj { > -/* > - spinlock_t ep_lk; > - bool to_be_destroyed; > - bool io_outstanding; > - > - wait_queue_head_t wait; > - bool stopped; > - atomic_t process_refcnt; // how many NDV_PKT_ID1_COMPLETE packets we are currently processing > -*/ > - bool stopping; > - wait_queue_head_t wait_pending; > - atomic_t nr_requests_pending; > - > - enum nd_resource_type type; > - enum connector_state state; //KYS need to look at locking > - struct iw_cm_id *cm_id; > - enum hvnd_cm_state cm_state; > - struct completion block_event; > - struct completion disconnect_event; > - struct completion connector_accept_event; > - int connector_accept_status; > - u64 ep_handle; > - spinlock_t incoming_pkt_list_lock; > - struct list_head incoming_pkt_list; > - struct hvnd_ep_obj *parent; > - struct hvnd_dev *nd_dev; > - struct hvnd_ucontext *uctx; > - struct hvnd_work wrk; > - struct hvnd_cq *cq; > - u8 ord; > - u8 ird; > - char priv_data[MAX_PRIVATE_DATA_LEN]; > - bool incoming; > - atomic_t disconnect_notified; > - u64 outstanding_handle; > - u32 local_irp; > - struct hvnd_ep_obj *outstanding_ep; > - struct pkt_nd_connector_connect connector_connect_pkt; > - int connector_connect_retry; > -}; > - > -struct hvnd_ucontext { > - struct ib_ucontext ibucontext; > - struct list_head listentry; > - struct ndv_pkt_hdr_create_1 create_pkt; > - struct ndv_pkt_hdr_create_1 create_pkt_ovl; /* Overlap handle */ > - struct pkt_nd_provider_ioctl pr_init_pkt; > - union ndv_context_handle file_handle; > - union ndv_context_handle file_handle_ovl; > - > - struct pkt_nd_open_adapter o_adap_pkt; > - > - u64 adaptor_hdl; > - > - /* > - * Protection domain state. 
> - */ > - struct pkt_nd_pd_create pd_cr_pkt; > - > - u64 uar_base; > - u64 bf_base; > - u32 bf_buf_size; > - u32 bf_offset; > - u32 cqe_size; > - u32 max_qp_wr; > - u32 max_sge; > - u32 max_cqe; > - u32 num_qps; > - > - /* > - * State to manage dorbell pages: > - */ > - struct list_head db_page_list; > - struct mutex db_page_mutex; > - > - atomic_t refcnt; > - > -}; > - > -struct hvnd_dev { > - struct ib_device ibdev; > - struct hv_device *hvdev; > - u32 device_cap_flags; > - unsigned char nports; > - bool ib_active; > - > - /* State to manage interaction with the host. > - */ > - > - spinlock_t uctxt_lk; > - struct list_head listentry; > - > - unsigned long mmio_sz; > - unsigned long mmio_start_addr; > - struct resource mmio_resource; > - void *mmio_virt; > - > - unsigned long negotiated_version; > - union ndv_packet_init init_pkt; > - struct ndv_pkt_hdr_init_resources_1 resources; > - struct ndv_pkt_hdr_bind_1 bind_pkt; > - > - struct ndv_pkt_hdr_create_1 global_create_pkt; > - union ndv_context_handle global_file_handle; > - > - struct semaphore query_pkt_sem; > - bool query_pkt_set; > - struct pkt_nd_query_adaptor query_pkt; > - > - /* > - * ID tables. > - */ > - spinlock_t id_lock; > - > - struct idr cqidr; > - struct idr qpidr; > - struct idr mmidr; > - struct idr irpidr; > - struct idr uctxidr; > - atomic_t open_cnt; > - > - char ip_addr[4]; > - char mac_addr[6]; > - struct completion addr_set; > - int bind_complete; > - struct mutex bind_mutex; > -}; > - > -struct hvnd_cq { > - struct ib_cq ibcq; > - void *cq_buf; > - void *db_addr; > - u32 arm_sn; > - u32 entries; > - > - u32 cqn; > - u32 cqe; > - u64 cq_handle; > - > - struct ib_umem *umem; > - struct ib_umem *db_umem; > - struct mlx4_ib_user_db_page user_db_page; > - struct hvnd_ucontext *uctx; > - struct hvnd_ep_obj ep_object; //KYS need to clean this up; have a cq irp state > - bool monitor; > - bool upcall_pending; > -}; > - > -struct hvnd_qp { > - struct ib_qp ibqp; > - void *qp_buf; > - void *db_addr; > - u32 buf_size; > - u8 port; > - struct hvnd_dev *nd_dev; > - > - __u8 log_sq_bb_count; > - __u8 log_sq_stride; > - __u8 sq_no_prefetch; > - > - int rq_wqe_cnt; > - int rq_wqe_shift; > - int rq_max_gs; > - > - int sq_wqe_cnt; > - int sq_wqe_shift; > - int sq_max_gs; > - > - u32 max_inline_data; > - > - u32 initiator_q_depth; > - u32 initiator_request_sge; > - > - u32 receive_q_depth; > - u32 receive_request_sge; > - > - struct hvnd_cq *recv_cq; > - struct hvnd_cq *send_cq; > - > - u64 receive_cq_handle; > - u64 initiator_cq_handle; > - u64 pd_handle; > - > - u64 qp_handle; > - u32 qpn; > - u32 max_send_wr; > - u32 max_recv_wr; > - u32 max_send_sge; > - u32 max_recv_sge; > - > - struct ib_umem *umem; > - struct ib_umem *db_umem; > - struct mlx4_ib_user_db_page user_db_page; > - struct hvnd_ucontext *uctx; > - struct iw_cm_id *cm_id; > - > - /* > - * Current QP state; need to look at locking. > - * XXXKYS > - */ > - enum ib_qp_state qp_state; > - bool cq_notify; > - wait_queue_head_t wait; > - atomic_t refcnt; > - struct hvnd_ep_obj *connector; > -}; > - > -struct hvnd_mr { > - struct ib_mr ibmr; > - struct hvnd_ib_pd *pd; > - struct ib_umem *umem; > - u64 start; > - u64 length; > - u64 virt; > - int acc; > - u64 mr_handle; > - u32 mr_lkey; > - u32 mr_rkey; > -}; > - > -struct hvnd_cookie { > - struct completion host_event; > - void *pkt; > -}; > - > -/* > - * Definitions to retrieve the IP address. 
> - */ > - > -#define HVND_CURRENT_VERSION 0 > - > -struct hvnd_ipaddr_tuple { > - char mac_address[ETH_ALEN]; > - struct sockaddr addr; > -}; > - > -struct hvnd_msg { > - int status; > - struct hvnd_ipaddr_tuple ip_tuple; > -}; > - > -static inline struct hvnd_ib_pd *to_nd_pd(struct ib_pd *pd) > -{ > - return container_of(pd, struct hvnd_ib_pd, ibpd); > -} > - > -static inline struct hvnd_dev *to_nd_dev(struct ib_device *ibdev) > -{ > - return container_of(ibdev, struct hvnd_dev, ibdev); > -} > - > -static inline struct hvnd_cq *to_nd_cq(struct ib_cq *ibcq) > -{ > - return container_of(ibcq, struct hvnd_cq, ibcq); > -} > - > -static inline struct hvnd_qp *to_nd_qp(struct ib_qp *ibqp) > -{ > - return container_of(ibqp, struct hvnd_qp, ibqp); > -} > - > -static inline struct hvnd_ucontext *to_nd_context(struct ib_ucontext *ibucontext) > -{ > - return container_of(ibucontext, struct hvnd_ucontext, ibucontext); > -} > - > -static inline struct hvnd_ucontext *get_uctx_from_pd(struct ib_pd *pd) > -{ > - return to_nd_context(pd->uobject->context); > -} > - > -static inline struct hvnd_mr *to_nd_mr(struct ib_mr *ibmr) > -{ > - return container_of(ibmr, struct hvnd_mr, ibmr); > -} > -/* > - * ID management. > - */ > - > -static inline int insert_handle(struct hvnd_dev *dev, struct idr *idr, > - void *handle, u32 id) > -{ > - int ret; > - unsigned long flags; > - > - idr_preload(GFP_KERNEL); > - spin_lock_irqsave(&dev->id_lock, flags); > - > - ret = idr_alloc(idr, handle, id, id + 1, GFP_ATOMIC); > - > - spin_unlock_irqrestore(&dev->id_lock, flags); > - idr_preload_end(); > - > - BUG_ON(ret == -ENOSPC); > - return ret < 0 ? ret : 0; > -} > - > -static inline void remove_handle(struct hvnd_dev *dev, struct idr *idr, u32 id) > -{ > - unsigned long flags; > - > - spin_lock_irqsave(&dev->id_lock, flags); > - idr_remove(idr, id); > - spin_unlock_irqrestore(&dev->id_lock, flags); > -} > - > -static inline struct hvnd_cq *get_cqp(struct hvnd_dev *dev, u32 cqid) > -{ > - struct hvnd_cq *cqp; > - unsigned long flags; > - > - spin_lock_irqsave(&dev->id_lock, flags); > - cqp = idr_find(&dev->cqidr, cqid); > - spin_unlock_irqrestore(&dev->id_lock, flags); > - > - return cqp; > -} > - > -static inline struct hvnd_qp *get_qpp(struct hvnd_dev *dev, u32 qpid) > -{ > - struct hvnd_qp *qpp; > - unsigned long flags; > - > - spin_lock_irqsave(&dev->id_lock, flags); > - qpp = idr_find(&dev->qpidr, qpid); > - spin_unlock_irqrestore(&dev->id_lock, flags); > - > - return qpp; > -} > - > -static inline struct hvnd_ucontext *get_uctx(struct hvnd_dev *dev, u32 pid) > -{ > - struct hvnd_ucontext *uctx; > - unsigned long flags; > - > - spin_lock_irqsave(&dev->id_lock, flags); > - uctx = idr_find(&dev->uctxidr, pid); > - spin_unlock_irqrestore(&dev->id_lock, flags); > - > - return uctx; > -} > - > - > -static inline void *map_irp_to_ctx(struct hvnd_dev *nd_dev, u32 irp) > -{ > - void *ctx; > - unsigned long flags; > - > - spin_lock_irqsave(&nd_dev->id_lock, flags); > - ctx = idr_find(&nd_dev->irpidr, irp); > - spin_unlock_irqrestore(&nd_dev->id_lock, flags); > - > - return ctx; > -} > - > - > - > -void hvnd_callback(void *context); > -int hvnd_negotiate_version(struct hvnd_dev *nd_dev); > -int hvnd_init_resources(struct hvnd_dev *nd_dev); > -int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, char *mac_addr); > -int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx); > -int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx); > -int hvnd_query_adaptor(struct hvnd_dev 
*nd_dev, struct hvnd_ucontext *uctx); > -int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev, > - struct hvnd_ib_pd *hvnd_pd); > - > -/* > - * CQ operations. > - */ > -int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_cq *cq); > -int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq); > -int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq, > - u32 notify_type, u64 irp_handle); > - > -/* > - * QP operations. > - */ > -int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_qp *qp); > - > -int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_qp *qp); > - > -int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_qp *qp); > - > -/* > - * MR operations. > - */ > - > -int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 pd_handle, u64 *mr_handle); > - > -int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle); > - > -int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - struct hvnd_mr *mr); > -int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle); > - > -/* > - * Listner operations > - */ > -int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, u64 *handle); > - > - > -int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle); > - > -int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, union nd_sockaddr_inet *addr); > - > -int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, u32 backlog); > - > -int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, union nd_sockaddr_inet *addr); > - > -int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 listener_handle, u64 connector_handle, > - u64 irp_handle); > - > -/* > - * Connector operations. 
> - */ > -int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 *connector_handle); > - > -int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle); > - > -int hvnd_cancelio_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle); > -int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle, union nd_sockaddr_inet *addr); > - > -int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit, > - u32 priv_data_length, const u8 *priv_data, > - u64 qp_handle, struct if_physical_addr *phys_addr, > - union nd_sockaddr_inet *dest_addr, struct hvnd_ep_obj *ep); > - > -int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, enum ibv_qp_state *qp_state); > - > -int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, > - u64 qp_handle, > - u32 in_rd_limit, u32 out_rd_limit, > - u32 priv_data_length, const u8 *priv_data, > - enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep); > - > -int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 connector_handle, > - u32 priv_data_length, u8 *priv_data, > - enum ibv_qp_state *qp_state); > - > -int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - struct nd_read_limits *rd_limits); > - > -int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - u8 *priv_data); > - > -int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - union nd_sockaddr_inet *peer_addr); > - > -int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, > - union nd_sockaddr_inet *local_addr); > - > -int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, struct hvnd_ep_obj *ep); > - > - > -int hvnd_connector_disconnect(struct hvnd_dev *nd_dev, > - struct hvnd_ucontext *uctx, > - u64 connector_handle, struct hvnd_ep_obj *ep); > - > -int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - u64 handle, u32 ioctl); > - > -int hvnd_cancel_io(struct hvnd_ep_obj *ep_object); > - > -char *hvnd_get_op_name(int ioctl); > - > -void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx); > -void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev,struct hvnd_ucontext *uctx); > -void hvnd_process_events(struct work_struct *work); > - > -void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep, int status); > -void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep, int status); > -void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status); > -void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status); > -void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status); > - > -void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp); > -int get_irp_handle(struct hvnd_dev *nd_dev, u32 *local, void *irp_ctx); > - > -void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr, > - u32 data_sz, u32 local, u32 remote, > - u32 ioctl_code, > - u32 ext_data_sz, u32 ext_data_offset, > - u64 irp_handle); > - > -int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev, > - struct ndv_packet_hdr_control_1 *hdr, > - u32 pkt_size, u64 
cookie); > - > -int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, > - union nd_sockaddr_inet *og_addr); > - > -int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote, char *mac_addr); > - > -void hvnd_addr_init(void); > - > -void hvnd_addr_deinit(void); > - > -bool ep_add_work_pending(struct hvnd_ep_obj *ep_object); > -void ep_del_work_pending(struct hvnd_ep_obj *ep_object); > -void ep_stop(struct hvnd_ep_obj *ep_object); > - > -#define current_pid() (current->pid) > -/* > - * NT STATUS defines. > - */ > - > -#define STATUS_SUCCESS 0x0 > -#define STATUS_PENDING 0x00000103 > -#define STATUS_CANCELLED 0xC0000120 > -#define STATUS_DISCONNECTED 0xC000020C > -#define STATUS_TIMEOUT 0xC00000B5 > - > -void inc_ioctl_counter_request(unsigned ioctl); > -void inc_ioctl_counter_response(unsigned ioctl); > - > -#define NDV_PROTOCOL_VAERSION_INVALID -1 > -#define NDV_PACKET_INIT_SIZE 16 /* Size of the INIT packet */ > - > -#define HVND_RING_SZ (PAGE_SIZE * 64) > - > -/* logging levels */ > -#define HVND_ERROR 0 > -#define HVND_WARN 1 > -#define HVND_INFO 2 > -#define HVND_DEBUG 3 > - > -extern int hvnd_log_level; > - > -#define hvnd_error(fmt, args...) hvnd_log(HVND_ERROR, fmt, ##args) > -#define hvnd_warn(fmt, args...) hvnd_log(HVND_WARN, fmt, ##args) > -#define hvnd_info(fmt, args...) hvnd_log(HVND_INFO, fmt, ##args) > -#define hvnd_debug(fmt, args...) hvnd_log(HVND_DEBUG, fmt, ##args) > - > -#define hvnd_log(level, fmt, args...) \ > -do { \ > - if (unlikely(hvnd_log_level >= (level))) \ > - printk(KERN_ERR "hvnd %s[%u]: " fmt, __func__, __LINE__, ##args); \ > -} while (0) > - > -#endif /* _VMBUS_RDMA_H */ >
Acked-by: Benjamin M Romer <benjamin.romer@canonical.com>
Acked-by: Marcelo Henrique Cerri <marcelo.cerri@canonical.com>
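Also worth calling out before the patch body: the hvnd_addr.c hunk below (identical in both renamed directories) makes the asynchronous rdma_resolve_ip() call synchronous by handing it a callback that fills a context structure and fires a struct completion, which the caller then blocks on with wait_for_completion(). The same shape in a self-contained userspace sketch, assuming a hypothetical slow_lookup() in place of rdma_resolve_ip() and a pthread condition variable standing in for the kernel completion; all names and the sample MAC value here are illustrative only.

/*
 * Synchronous wrapper around an async callback, the pattern used by
 * hvnd_get_neigh_mac_addr() below: the callback stashes its result in
 * a shared context and signals the waiter.
 */
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

struct resolve_ctx {
    char result[32];
    int done;
    pthread_mutex_t lock;
    pthread_cond_t cond;
};

/* Callback: record the result and wake the waiter, like resolve_cb(). */
static void resolve_done(const char *result, void *context)
{
    struct resolve_ctx *ctx = context;

    pthread_mutex_lock(&ctx->lock);
    strncpy(ctx->result, result, sizeof(ctx->result) - 1);
    ctx->done = 1;
    pthread_cond_signal(&ctx->cond);
    pthread_mutex_unlock(&ctx->lock);
}

/* Stand-in for the async producer (rdma_resolve_ip in the driver). */
static void *slow_lookup(void *context)
{
    sleep(1);                        /* pretend to resolve a neighbour */
    resolve_done("00:15:5d:00:01:02", context);
    return NULL;
}

int main(void)
{
    struct resolve_ctx ctx = {
        .lock = PTHREAD_MUTEX_INITIALIZER,
        .cond = PTHREAD_COND_INITIALIZER,
    };
    pthread_t thr;

    pthread_create(&thr, NULL, slow_lookup, &ctx);

    /* Equivalent of wait_for_completion(&ctx.comp) in the driver. */
    pthread_mutex_lock(&ctx.lock);
    while (!ctx.done)
        pthread_cond_wait(&ctx.cond, &ctx.lock);
    pthread_mutex_unlock(&ctx.lock);

    printf("resolved mac: %s\n", ctx.result);
    pthread_join(thr, NULL);
    return 0;
}

Build with cc -pthread. The printf at the end can only run after the callback has signalled, which is exactly the ordering hvnd_get_neigh_mac_addr() relies on before it copies dst_dev_addr out of the rdma_dev_addr.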
diff --git a/drivers/infiniband/hw/vmbus-rdma/Makefile b/drivers/infiniband/hw/vmbus-rdma/Makefile index b6e900002e2f..9a0cb3cff61b 100644 --- a/drivers/infiniband/hw/vmbus-rdma/Makefile +++ b/drivers/infiniband/hw/vmbus-rdma/Makefile @@ -1,2 +1,2 @@ -obj-$(CONFIG_HYPERV_INFINIBAND_ND) += vmbus-rdma-142/ -obj-$(CONFIG_HYPERV_INFINIBAND_ND) += vmbus-rdma-144/ +obj-$(CONFIG_HYPERV_INFINIBAND_ND) += vmbus-rdma-142.0/ +obj-$(CONFIG_HYPERV_INFINIBAND_ND) += vmbus-rdma-144.0/ diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/COPYING b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/COPYING new file mode 100644 index 000000000000..f0960c946b07 --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/COPYING @@ -0,0 +1,339 @@ + +"This software program is licensed subject to the GNU General Public License +(GPL). Version 2, June 1991, available at +<http://www.fsf.org/copyleft/gpl.html>" + +GNU General Public License + +Version 2, June 1991 + +Copyright (C) 1989, 1991 Free Software Foundation, Inc. +51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + +Everyone is permitted to copy and distribute verbatim copies of this license +document, but changing it is not allowed. + +Preamble + +The licenses for most software are designed to take away your freedom to +share and change it. By contrast, the GNU General Public License is intended +to guarantee your freedom to share and change free software--to make sure +the software is free for all its users. This General Public License applies +to most of the Free Software Foundation's software and to any other program +whose authors commit to using it. (Some other Free Software Foundation +software is covered by the GNU Library General Public License instead.) You +can apply it to your programs, too. + +When we speak of free software, we are referring to freedom, not price. Our +General Public Licenses are designed to make sure that you have the freedom +to distribute copies of free software (and charge for this service if you +wish), that you receive source code or can get it if you want it, that you +can change the software or use pieces of it in new free programs; and that +you know you can do these things. + +To protect your rights, we need to make restrictions that forbid anyone to +deny you these rights or to ask you to surrender the rights. These +restrictions translate to certain responsibilities for you if you distribute +copies of the software, or if you modify it. + +For example, if you distribute copies of such a program, whether gratis or +for a fee, you must give the recipients all the rights that you have. You +must make sure that they, too, receive or can get the source code. And you +must show them these terms so they know their rights. + +We protect your rights with two steps: (1) copyright the software, and (2) +offer you this license which gives you legal permission to copy, distribute +and/or modify the software. + +Also, for each author's protection and ours, we want to make certain that +everyone understands that there is no warranty for this free software. If +the software is modified by someone else and passed on, we want its +recipients to know that what they have is not the original, so that any +problems introduced by others will not reflect on the original authors' +reputations. + +Finally, any free program is threatened constantly by software patents. 
We +wish to avoid the danger that redistributors of a free program will +individually obtain patent licenses, in effect making the program +proprietary. To prevent this, we have made it clear that any patent must be +licensed for everyone's free use or not licensed at all. + +The precise terms and conditions for copying, distribution and modification +follow. + +TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + +0. This License applies to any program or other work which contains a notice + placed by the copyright holder saying it may be distributed under the + terms of this General Public License. The "Program", below, refers to any + such program or work, and a "work based on the Program" means either the + Program or any derivative work under copyright law: that is to say, a + work containing the Program or a portion of it, either verbatim or with + modifications and/or translated into another language. (Hereinafter, + translation is included without limitation in the term "modification".) + Each licensee is addressed as "you". + + Activities other than copying, distribution and modification are not + covered by this License; they are outside its scope. The act of running + the Program is not restricted, and the output from the Program is covered + only if its contents constitute a work based on the Program (independent + of having been made by running the Program). Whether that is true depends + on what the Program does. + +1. You may copy and distribute verbatim copies of the Program's source code + as you receive it, in any medium, provided that you conspicuously and + appropriately publish on each copy an appropriate copyright notice and + disclaimer of warranty; keep intact all the notices that refer to this + License and to the absence of any warranty; and give any other recipients + of the Program a copy of this License along with the Program. + + You may charge a fee for the physical act of transferring a copy, and you + may at your option offer warranty protection in exchange for a fee. + +2. You may modify your copy or copies of the Program or any portion of it, + thus forming a work based on the Program, and copy and distribute such + modifications or work under the terms of Section 1 above, provided that + you also meet all of these conditions: + + * a) You must cause the modified files to carry prominent notices stating + that you changed the files and the date of any change. + + * b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any part + thereof, to be licensed as a whole at no charge to all third parties + under the terms of this License. + + * c) If the modified program normally reads commands interactively when + run, you must cause it, when started running for such interactive + use in the most ordinary way, to print or display an announcement + including an appropriate copyright notice and a notice that there is + no warranty (or else, saying that you provide a warranty) and that + users may redistribute the program under these conditions, and + telling the user how to view a copy of this License. (Exception: if + the Program itself is interactive but does not normally print such + an announcement, your work based on the Program is not required to + print an announcement.) + + These requirements apply to the modified work as a whole. 
If identifiable + sections of that work are not derived from the Program, and can be + reasonably considered independent and separate works in themselves, then + this License, and its terms, do not apply to those sections when you + distribute them as separate works. But when you distribute the same + sections as part of a whole which is a work based on the Program, the + distribution of the whole must be on the terms of this License, whose + permissions for other licensees extend to the entire whole, and thus to + each and every part regardless of who wrote it. + + Thus, it is not the intent of this section to claim rights or contest + your rights to work written entirely by you; rather, the intent is to + exercise the right to control the distribution of derivative or + collective works based on the Program. + + In addition, mere aggregation of another work not based on the Program + with the Program (or with a work based on the Program) on a volume of a + storage or distribution medium does not bring the other work under the + scope of this License. + +3. You may copy and distribute the Program (or a work based on it, under + Section 2) in object code or executable form under the terms of Sections + 1 and 2 above provided that you also do one of the following: + + * a) Accompany it with the complete corresponding machine-readable source + code, which must be distributed under the terms of Sections 1 and 2 + above on a medium customarily used for software interchange; or, + + * b) Accompany it with a written offer, valid for at least three years, + to give any third party, for a charge no more than your cost of + physically performing source distribution, a complete machine- + readable copy of the corresponding source code, to be distributed + under the terms of Sections 1 and 2 above on a medium customarily + used for software interchange; or, + + * c) Accompany it with the information you received as to the offer to + distribute corresponding source code. (This alternative is allowed + only for noncommercial distribution and only if you received the + program in object code or executable form with such an offer, in + accord with Subsection b above.) + + The source code for a work means the preferred form of the work for + making modifications to it. For an executable work, complete source code + means all the source code for all modules it contains, plus any + associated interface definition files, plus the scripts used to control + compilation and installation of the executable. However, as a special + exception, the source code distributed need not include anything that is + normally distributed (in either source or binary form) with the major + components (compiler, kernel, and so on) of the operating system on which + the executable runs, unless that component itself accompanies the + executable. + + If distribution of executable or object code is made by offering access + to copy from a designated place, then offering equivalent access to copy + the source code from the same place counts as distribution of the source + code, even though third parties are not compelled to copy the source + along with the object code. + +4. You may not copy, modify, sublicense, or distribute the Program except as + expressly provided under this License. Any attempt otherwise to copy, + modify, sublicense or distribute the Program is void, and will + automatically terminate your rights under this License. 
However, parties + who have received copies, or rights, from you under this License will not + have their licenses terminated so long as such parties remain in full + compliance. + +5. You are not required to accept this License, since you have not signed + it. However, nothing else grants you permission to modify or distribute + the Program or its derivative works. These actions are prohibited by law + if you do not accept this License. Therefore, by modifying or + distributing the Program (or any work based on the Program), you + indicate your acceptance of this License to do so, and all its terms and + conditions for copying, distributing or modifying the Program or works + based on it. + +6. Each time you redistribute the Program (or any work based on the + Program), the recipient automatically receives a license from the + original licensor to copy, distribute or modify the Program subject to + these terms and conditions. You may not impose any further restrictions + on the recipients' exercise of the rights granted herein. You are not + responsible for enforcing compliance by third parties to this License. + +7. If, as a consequence of a court judgment or allegation of patent + infringement or for any other reason (not limited to patent issues), + conditions are imposed on you (whether by court order, agreement or + otherwise) that contradict the conditions of this License, they do not + excuse you from the conditions of this License. If you cannot distribute + so as to satisfy simultaneously your obligations under this License and + any other pertinent obligations, then as a consequence you may not + distribute the Program at all. For example, if a patent license would + not permit royalty-free redistribution of the Program by all those who + receive copies directly or indirectly through you, then the only way you + could satisfy both it and this License would be to refrain entirely from + distribution of the Program. + + If any portion of this section is held invalid or unenforceable under any + particular circumstance, the balance of the section is intended to apply + and the section as a whole is intended to apply in other circumstances. + + It is not the purpose of this section to induce you to infringe any + patents or other property right claims or to contest validity of any + such claims; this section has the sole purpose of protecting the + integrity of the free software distribution system, which is implemented + by public license practices. Many people have made generous contributions + to the wide range of software distributed through that system in + reliance on consistent application of that system; it is up to the + author/donor to decide if he or she is willing to distribute software + through any other system and a licensee cannot impose that choice. + + This section is intended to make thoroughly clear what is believed to be + a consequence of the rest of this License. + +8. If the distribution and/or use of the Program is restricted in certain + countries either by patents or by copyrighted interfaces, the original + copyright holder who places the Program under this License may add an + explicit geographical distribution limitation excluding those countries, + so that distribution is permitted only in or among countries not thus + excluded. In such case, this License incorporates the limitation as if + written in the body of this License. + +9. The Free Software Foundation may publish revised and/or new versions of + the General Public License from time to time. 
Such new versions will be + similar in spirit to the present version, but may differ in detail to + address new problems or concerns. + + Each version is given a distinguishing version number. If the Program + specifies a version number of this License which applies to it and "any + later version", you have the option of following the terms and + conditions either of that version or of any later version published by + the Free Software Foundation. If the Program does not specify a version + number of this License, you may choose any version ever published by the + Free Software Foundation. + +10. If you wish to incorporate parts of the Program into other free programs + whose distribution conditions are different, write to the author to ask + for permission. For software which is copyrighted by the Free Software + Foundation, write to the Free Software Foundation; we sometimes make + exceptions for this. Our decision will be guided by the two goals of + preserving the free status of all derivatives of our free software and + of promoting the sharing and reuse of software generally. + + NO WARRANTY + +11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY + FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN + OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES + PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER + EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE + ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH + YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL + NECESSARY SERVICING, REPAIR OR CORRECTION. + +12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING + WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR + REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR + DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL + DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM + (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED + INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF + THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR + OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. + +END OF TERMS AND CONDITIONS + +How to Apply These Terms to Your New Programs + +If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it free +software which everyone can redistribute and change under these terms. + +To do so, attach the following notices to the program. It is safest to +attach them to the start of each source file to most effectively convey the +exclusion of warranty; and each file should have at least the "copyright" +line and a pointer to where the full notice is found. + +one line to give the program's name and an idea of what it does. +Copyright (C) yyyy name of author + +This program is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the Free +Software Foundation; either version 2 of the License, or (at your option) +any later version. + +This program is distributed in the hope that it will be useful, but WITHOUT +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for +more details. 
+ +You should have received a copy of the GNU General Public License along with +this program; if not, write to the Free Software Foundation, Inc., 59 +Temple Place - Suite 330, Boston, MA 02111-1307, USA. + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this when +it starts in an interactive mode: + +Gnomovision version 69, Copyright (C) year name of author Gnomovision comes +with ABSOLUTELY NO WARRANTY; for details type 'show w'. This is free +software, and you are welcome to redistribute it under certain conditions; +type 'show c' for details. + +The hypothetical commands 'show w' and 'show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may be +called something other than 'show w' and 'show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + +Yoyodyne, Inc., hereby disclaims all copyright interest in the program +'Gnomovision' (which makes passes at compilers) written by James Hacker. + +signature of Ty Coon, 1 April 1989 +Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Library General Public +License instead of this License. diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/Makefile b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/Makefile new file mode 100644 index 000000000000..8fc23718bcf1 --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/Makefile @@ -0,0 +1,3 @@ +obj-$(CONFIG_HYPERV_INFINIBAND_ND) += hv_network_direct_142.0.o + +hv_network_direct_142.0-y := provider.o vmbus_rdma.o hvnd_addr.o diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/hvnd_addr.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/hvnd_addr.c new file mode 100644 index 000000000000..c8c2295b63af --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/hvnd_addr.c @@ -0,0 +1,136 @@ +/* + * Copyright (c) 2014, Microsoft Corporation. + * + * Author: + * K. Y. Srinivasan <kys@microsoft.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. 
+ * + * Bug fixes/enhancements: Long Li <longli@microsoft.com> + */ + +#include <linux/completion.h> +#include <linux/module.h> +#include <linux/errno.h> +#include <linux/hyperv.h> +#include <linux/efi.h> +#include <linux/slab.h> +#include <linux/cred.h> +#include <linux/sched.h> +#include <linux/types.h> +#include <linux/completion.h> +#include <linux/scatterlist.h> +#include <rdma/ib_umem.h> +#include <rdma/ib_addr.h> + +#include "vmbus_rdma.h" + + +#include <linux/semaphore.h> +#include <linux/fs.h> +#include <linux/nls.h> +#include <linux/workqueue.h> +#include <linux/cdev.h> +#include <linux/hyperv.h> +#include <linux/sched.h> +#include <linux/uaccess.h> +#include <linux/miscdevice.h> +#include <linux/hyperv.h> + + +int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + union nd_sockaddr_inet *og_addr) +{ + int ret; + /* + * Query the host and select the first address. + */ + struct pkt_query_addr_list pkt; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + + hvnd_init_hdr(&pkt.hdr, + (sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1)), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST, 0, 0, 0); + + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.reserved = 0; + pkt.ioctl.in.handle = uctx->adaptor_hdl; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, + sizeof(pkt), (u64)&pkt); + + if (ret) + return ret; + + /* + * Copy the address out. + */ + + memcpy(og_addr, &pkt.ioctl.out[0], sizeof(*og_addr)); + return 0; + +} + +static struct rdma_addr_client self; + +struct resolve_cb_context { + struct rdma_dev_addr *addr; + struct completion comp; +}; + +void hvnd_addr_init(void) +{ + rdma_addr_register_client(&self); +} + +void hvnd_addr_deinit(void) +{ + rdma_addr_unregister_client(&self); +} + +static void resolve_cb(int status, struct sockaddr *src_addr, + struct rdma_dev_addr *addr, void *context) +{ + memcpy(((struct resolve_cb_context *)context)->addr, addr, sizeof(struct + rdma_dev_addr)); + complete(&((struct resolve_cb_context *)context)->comp); +} + +int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote, + char *mac_addr) +{ + struct rdma_dev_addr dev_addr; + struct resolve_cb_context ctx; + int ret; + + memset(&dev_addr, 0, sizeof(dev_addr)); + dev_addr.net = &init_net; + ctx.addr = &dev_addr; + init_completion(&ctx.comp); + + ret = rdma_resolve_ip(&self, local, remote, &dev_addr, 1000, + resolve_cb, &ctx); + + if (ret) { + hvnd_error("rdma_resolve_ip failed ret=%d\n", ret); + return ret; + } + + wait_for_completion(&ctx.comp); + memcpy(mac_addr, dev_addr.dst_dev_addr, ETH_ALEN); + return ret; +} diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/mx_abi.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/mx_abi.h new file mode 100644 index 000000000000..8f55cb408de0 --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/mx_abi.h @@ -0,0 +1,232 @@ +/* + * Copyright (c) 2005 Topspin Communications. All rights reserved. + * Copyright (c) 2005 Cisco Systems. All rights reserved. + * Copyright (c) 2005 PathScale, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. 
You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * KYS: made some modifications. + */ + +#ifndef MX_ABI_H +#define MX_ABI_H + + + +/* + * Make sure that all structs defined in this file remain laid out so + * that they pack the same way on 32-bit and 64-bit architectures (to + * avoid incompatibility between 32-bit userspace and 64-bit kernels). + * Specifically: + * - Do not use pointer types -- pass pointers in UINT64 instead. + * - Make sure that any structure larger than 4 bytes is padded to a + * multiple of 8 bytes. Otherwise the structure size will be + * different between 32-bit and 64-bit architectures. 
+ */ + +enum ibv_get_context_mappings { + IBV_GET_CONTEXT_UAR, + IBV_GET_CONTEXT_BF, + IBV_GET_CONTEXT_MAPPING_MAX +}; + +struct ibv_get_context_req { + + union nd_mapping mappings[IBV_GET_CONTEXT_MAPPING_MAX]; +}; + +struct ibv_get_context_resp { + + // mmap UAR and BF + struct nd_mapping_result mapping_results[IBV_GET_CONTEXT_MAPPING_MAX]; + + // mmap Blue Flame + int bf_buf_size; + int bf_offset; + + // mlx4_query_device result + int max_qp_wr; + int max_sge; + int max_cqe; + + // general parameters + u32 cqe_size; + u32 vend_id; + u16 dev_id; + u16 bf_reg_size; + u16 bf_regs_per_page; + u16 reserved1; + + // ibv_cmd_get_context result + u32 qp_tab_size; + + u32 reserved2; +}; + +struct ibv_alloc_pd_resp { + u64 pd_handle; + u32 pdn; + u32 reserved; +}; + +struct ibv_reg_mr { + u64 start; + u64 length; + u64 hca_va; + u32 access_flags; + u32 pdn; + u64 pd_handle; +}; + +struct ibv_reg_mr_resp { + u64 mr_handle; + u32 lkey; + u32 rkey; +}; + + +enum mlx4_ib_create_cq_mapping { + MLX4_IB_CREATE_CQ_BUF, + MLX4_IB_CREATE_CQ_DB, + MLX4_IB_CREATE_CQ_ARM_SN, // Windows specific + MLX4_IB_CREATE_CQ_MAPPING_MAX +}; + +#define MLX4_CQ_FLAGS_ARM_IN_KERNEL 1 + +struct ibv_create_cq { + union nd_mapping mappings[MLX4_IB_CREATE_CQ_MAPPING_MAX]; + u32 flags; +}; + +struct ibv_create_cq_resp { + struct nd_mapping_result mapping_results[MLX4_IB_CREATE_CQ_MAPPING_MAX]; + u32 cqn; + u32 cqe; +}; + +enum mlx4_ib_create_srq_mappings { + MLX4_IB_CREATE_SRQ_BUF, + MLX4_IB_CREATE_SRQ_DB, + MLX4_IB_CREATE_SRQ_MAPPINGS_MAX +}; + +struct ibv_create_srq { + union nd_mapping mappings[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX]; +}; + +struct ibv_create_srq_resp { + struct nd_mapping_result mapping_results[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX]; +}; + +enum mlx4_ib_create_qp_mappings { + MLX4_IB_CREATE_QP_BUF, + MLX4_IB_CREATE_QP_DB, + MLX4_IB_CREATE_QP_MAPPINGS_MAX +}; + +struct ibv_create_qp { + union nd_mapping mappings[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; + u8 log_sq_bb_count; + u8 log_sq_stride; + u8 sq_no_prefetch; + u8 reserved; +}; + +struct ibv_create_qp_resp { + struct nd_mapping_result mapping_results[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; + // struct ib_uverbs_create_qp_resp + u64 qp_handle; + u32 qpn; + u32 max_send_wr; + u32 max_recv_wr; + u32 max_send_sge; + u32 max_recv_sge; + u32 max_inline_data; +}; + +enum ibv_qp_attr_mask { + IBV_QP_STATE = 1 << 0, + IBV_QP_CUR_STATE = 1 << 1, + IBV_QP_EN_SQD_ASYNC_NOTIFY = 1 << 2, + IBV_QP_ACCESS_FLAGS = 1 << 3, + IBV_QP_PKEY_INDEX = 1 << 4, + IBV_QP_PORT = 1 << 5, + IBV_QP_QKEY = 1 << 6, + IBV_QP_AV = 1 << 7, + IBV_QP_PATH_MTU = 1 << 8, + IBV_QP_TIMEOUT = 1 << 9, + IBV_QP_RETRY_CNT = 1 << 10, + IBV_QP_RNR_RETRY = 1 << 11, + IBV_QP_RQ_PSN = 1 << 12, + IBV_QP_MAX_QP_RD_ATOMIC = 1 << 13, + IBV_QP_ALT_PATH = 1 << 14, + IBV_QP_MIN_RNR_TIMER = 1 << 15, + IBV_QP_SQ_PSN = 1 << 16, + IBV_QP_MAX_DEST_RD_ATOMIC = 1 << 17, + IBV_QP_PATH_MIG_STATE = 1 << 18, + IBV_QP_CAP = 1 << 19, + IBV_QP_DEST_QPN = 1 << 20 +}; + +enum ibv_qp_state { + IBV_QPS_RESET, + IBV_QPS_INIT, + IBV_QPS_RTR, + IBV_QPS_RTS, + IBV_QPS_SQD, + IBV_QPS_SQE, + IBV_QPS_ERR +}; + + +struct ibv_modify_qp_resp { + enum ibv_qp_attr_mask attr_mask; + u8 qp_state; + u8 reserved[3]; +}; + +struct ibv_create_ah_resp { + u64 start; +}; + +/* + * Some mlx4 specific kernel definitions. Perhaps could be in + * separate file. 
+ */ + +struct mlx4_ib_user_db_page { + struct list_head list; + struct ib_umem *umem; + unsigned long user_virt; + int refcnt; +}; + + +#endif /* MX_ABI_H */ diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/provider.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/provider.c new file mode 100644 index 000000000000..7115cf769fe0 --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/provider.c @@ -0,0 +1,2961 @@ +/* + * Copyright (c) 2014, Microsoft Corporation. + * + * Author: + * K. Y. Srinivasan <kys@microsoft.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. + * + * Bug fixes/enhancements: Long Li <longli@microsoft.com> + */ + +#include <linux/module.h> +#include <linux/moduleparam.h> +#include <linux/device.h> +#include <linux/netdevice.h> +#include <linux/etherdevice.h> +#include <linux/delay.h> +#include <linux/errno.h> +#include <linux/list.h> +#include <linux/spinlock.h> +#include <linux/ethtool.h> +#include <linux/rtnetlink.h> +#include <linux/inetdevice.h> +#include <linux/io.h> +#include <linux/hyperv.h> +#include <linux/completion.h> +#include <rdma/iw_cm.h> +#include <rdma/ib_verbs.h> +#include <rdma/ib_smi.h> +#include <rdma/ib_umem.h> +#include <rdma/ib_user_verbs.h> +#include <rdma/rdma_cm.h> +#include <linux/proc_fs.h> +#include <linux/seq_file.h> +#include "vmbus_rdma.h" + +/* + * We are emulating mlx4. XXXKYS: May have to FIX. 
+ */ +#include "./user.h" + +static struct hvnd_dev *g_nd_dev; // the one and only one + +int hvnd_log_level = HVND_ERROR; +module_param(hvnd_log_level, int, 0644); +MODULE_PARM_DESC(hvnd_log_level, + "Logging level, 0 - Error (default), 1 - Warning, 2 - Info, 3 - Debug."); + +static int disable_cq_notify = 1; +//static int disable_cq_notify = 0; +module_param(disable_cq_notify, int, 0644); +MODULE_PARM_DESC(disable_cq_notify, + "Disable CQ notification, 0 - Enable, 1 - Disable (default)."); + +enum { + MLX4_USER_DEV_CAP_64B_CQE = 1L << 0 +}; + +#define HVND_NODE_DESC "vmbus-RDMA" + +#undef MLX4_IB_UVERBS_ABI_VERSION +#define MLX4_IB_UVERBS_ABI_VERSION 4 + +struct mlx4_wqe_data_seg { + __be32 byte_count; + __be32 lkey; + __be64 addr; +}; + +/* return value: + * true: ep is running + * false: ep is stopped + */ +bool ep_add_work_pending(struct hvnd_ep_obj *ep_object) +{ + bool ret = true; + + atomic_inc(&ep_object->nr_requests_pending); + if (ep_object->stopping) { + if (atomic_dec_and_test(&ep_object->nr_requests_pending)) + wake_up(&ep_object->wait_pending); + ret = false; + } + return ret; +} + +void ep_del_work_pending(struct hvnd_ep_obj *ep_object) +{ + if (atomic_dec_and_test(&ep_object->nr_requests_pending)) + wake_up(&ep_object->wait_pending); + + if (atomic_read(&ep_object->nr_requests_pending) < 0) { + hvnd_error("ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", + atomic_read(&ep_object->nr_requests_pending), + ep_object->type, ep_object->cm_state); + dump_stack(); + } +} + +void ep_stop(struct hvnd_ep_obj *ep_object) +{ + if (!ep_object->stopping) { + ep_object->stopping = true; + hvnd_cancel_io(ep_object); + } + + if (atomic_read(&ep_object->nr_requests_pending) < 0) { + hvnd_error("IO canceled, ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", + atomic_read(&ep_object->nr_requests_pending), + ep_object->type, ep_object->cm_state); + dump_stack(); + } + + wait_event(ep_object->wait_pending, + !atomic_read(&ep_object->nr_requests_pending)); +} + +static int vmbus_dma_map_sg(struct device *dev, struct scatterlist *sgl, + int nents, enum dma_data_direction direction, + unsigned long attrs) +{ + struct scatterlist *sg; + u64 addr; + int i; + int ret = nents; + + BUG_ON(!valid_dma_direction(direction)); + + for_each_sg(sgl, sg, nents, i) { + addr = (u64) page_address(sg_page(sg)); + /* TODO: handle highmem pages */ + if (!addr) { + ret = 0; + break; + } + sg->dma_address = addr + sg->offset; + sg->dma_length = sg->length; + } + return ret; +} + +static void vmbus_dma_unmap_sg(struct device *dev, + struct scatterlist *sg, int nents, + enum dma_data_direction direction, unsigned long attrs) +{ + BUG_ON(!valid_dma_direction(direction)); +} + + +const struct dma_map_ops vmbus_dma_ops = { + .map_sg = vmbus_dma_map_sg, + .unmap_sg = vmbus_dma_unmap_sg, +}; + +static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, + struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx); + +static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, + struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx); + +static void hvnd_deinit_ep(struct hvnd_ep_obj *ep) +{ + put_irp_handle(ep->nd_dev, ep->local_irp); +} + +static void hvnd_destroy_ep(struct hvnd_ep_obj *ep) +{ + hvnd_debug("canceling work for ep %p\n", ep); + cancel_work_sync(&ep->wrk.work); + hvnd_deinit_ep(ep); + kfree(ep); +} + + +#define UC(b) (((int)b)&0xff) +char *debug_inet_ntoa(struct in_addr in, char *b) +{ + register char *p; + + p = (char *)∈ + (void)snprintf(b, 20, + "%d.%d.%d.%d", 
UC(p[0]), UC(p[1]), UC(p[2]), UC(p[3])); + return b; +} + +void hvnd_process_events(struct work_struct *work); + +static int hvnd_init_ep(struct hvnd_ep_obj *ep_object, + struct iw_cm_id *cm_id, int ep_type, + struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx) +{ + int ret; + + ep_object->type = ep_type; + ep_object->cm_id = cm_id; + ep_object->nd_dev = nd_dev; + ep_object->uctx = uctx; + + ep_object->parent = NULL; + + ep_object->wrk.callback_arg = ep_object; + INIT_WORK(&ep_object->wrk.work, hvnd_process_events); + INIT_LIST_HEAD(&ep_object->incoming_pkt_list); + spin_lock_init(&ep_object->incoming_pkt_list_lock); + + ep_object->stopping = false; + atomic_set(&ep_object->nr_requests_pending, 0); + init_waitqueue_head(&ep_object->wait_pending); + + ret = get_irp_handle(nd_dev, &ep_object->local_irp, (void *)ep_object); + + if (ret) { + hvnd_error("get_irp_handle() failed: err: %d\n", ret); + return ret; + } + return 0; + +} + +static int set_rq_size(struct hvnd_dev *dev, struct ib_qp_cap *cap, + struct hvnd_qp *qp) +{ + + /* HW requires >= 1 RQ entry with >= 1 gather entry */ + if (!cap->max_recv_wr || !cap->max_recv_sge) + return -EINVAL; + + qp->rq_wqe_cnt = roundup_pow_of_two(max(1U, cap->max_recv_wr)); + qp->rq_max_gs = roundup_pow_of_two(max(1U, cap->max_recv_sge)); + qp->rq_wqe_shift = ilog2( + qp->rq_max_gs * sizeof(struct mlx4_wqe_data_seg)); + + + return 0; +} + +static int set_user_sq_size(struct hvnd_dev *dev, + struct hvnd_qp *qp, + struct mlx4_ib_create_qp *ucmd) +{ + qp->sq_wqe_cnt = 1 << ucmd->log_sq_bb_count; + qp->sq_wqe_shift = ucmd->log_sq_stride; + + qp->buf_size = (qp->rq_wqe_cnt << qp->rq_wqe_shift) + + (qp->sq_wqe_cnt << qp->sq_wqe_shift); + + return 0; +} + +static int hvnd_db_map_user(struct hvnd_ucontext *uctx, unsigned long virt, + struct ib_umem **db_umem) +{ + struct mlx4_ib_user_db_page *page; + int err = 0; + + mutex_lock(&uctx->db_page_mutex); + + list_for_each_entry(page, &uctx->db_page_list, list) + if (page->user_virt == (virt & PAGE_MASK)) + goto found; + + page = kmalloc(sizeof(*page), GFP_KERNEL); + if (!page) { + err = -ENOMEM; + goto out; + } + + page->user_virt = (virt & PAGE_MASK); + page->refcnt = 0; + page->umem = ib_umem_get(&uctx->ibucontext, virt & PAGE_MASK, + PAGE_SIZE, 0, 0); + if (IS_ERR(page->umem)) { + hvnd_error("ib_umem_get failure\n"); + err = PTR_ERR(page->umem); + kfree(page); + goto out; + } + + list_add(&page->list, &uctx->db_page_list); + +found: + ++page->refcnt; +out: + mutex_unlock(&uctx->db_page_mutex); + if (!err) + *db_umem = page->umem; + + return err; +} + +static void hvnd_db_unmap_user(struct hvnd_ucontext *uctx, u64 db_addr) +{ + struct mlx4_ib_user_db_page *page; + + mutex_lock(&uctx->db_page_mutex); + list_for_each_entry(page, &uctx->db_page_list, list) + if (page->user_virt == (db_addr & PAGE_MASK)) + goto found; + +found: + if (!--page->refcnt) { + list_del(&page->list); + ib_umem_release(page->umem); + kfree(page); + } + + mutex_unlock(&uctx->db_page_mutex); +} + + +static void debug_check(const char *func, int line) +{ + hvnd_debug("func is: %s; line is %d\n", func, line); + + if (in_interrupt()) { + hvnd_error("In interrupt func is: %s; line is %d\n", + func, line); + return; + } +} + +static struct ib_ah *hvnd_ah_create(struct ib_pd *pd, + struct ib_ah_attr *ah_attr, + struct ib_udata *udata) +{ + debug_check(__func__, __LINE__); + return ERR_PTR(-ENOSYS); +} + +static int hvnd_ah_destroy(struct ib_ah *ah) +{ + debug_check(__func__, __LINE__); + return -ENOSYS; +} + +static int hvnd_multicast_attach(struct 
ib_qp *ibqp, union ib_gid *gid, u16 lid) +{ + debug_check(__func__, __LINE__); + return -ENOSYS; +} + +static int hvnd_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) +{ + debug_check(__func__, __LINE__); + return -ENOSYS; +} + +static int hvnd_process_mad(struct ib_device *ibdev, + int mad_flags, + u8 port_num, + const struct ib_wc *in_wc, + const struct ib_grh *in_grh, + const struct ib_mad_hdr *in_mad, + size_t in_mad_size, + struct ib_mad_hdr *out_mad, + size_t *out_mad_size, + u16 *out_mad_pkey_index) +{ + debug_check(__func__, __LINE__); + return -ENOSYS; +} + +void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx) +{ + atomic_inc(&uctx->refcnt); +} + +void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx) +{ + if (atomic_dec_and_test(&uctx->refcnt)) { + hvnd_debug("uctx ref cnt dropped it is %d\n", + atomic_read(&uctx->refcnt)); + hvnd_debug("About to close adaptor\n"); + hvnd_close_adaptor(nd_dev, uctx); + } else + hvnd_debug("uctx ref cnt dropped it is %d\n", + atomic_read(&uctx->refcnt)); +} + + +static int hvnd_dealloc_ucontext(struct ib_ucontext *context) +{ + struct hvnd_dev *nd_dev; + struct hvnd_ucontext *uctx; + + uctx = to_nd_context(context); + nd_dev = to_nd_dev(context->device); + + hvnd_debug("calling %s\n", __func__); + + hvnd_drop_uctx_ref(nd_dev, uctx); + + return 0; +} + +static struct ib_ucontext *hvnd_alloc_ucontext(struct ib_device *ibdev, + struct ib_udata *udata) +{ + struct hvnd_dev *nd_dev = to_nd_dev(ibdev); + struct hvnd_ucontext *uctx; + struct mlx4_ib_alloc_ucontext_resp resp; + struct ibv_get_context_resp *ctx_output; + int ret; + + if (!nd_dev->ib_active) { + hvnd_error("ib device is not active, try again\n"); + return ERR_PTR(-EAGAIN); + } + + uctx = get_uctx(nd_dev, current_pid()); + if (uctx) { + // it is already opened, just increase its reference count + hvnd_acquire_uctx_ref(uctx); + } else { + + /* + * The Windows host expects the following to be done: + * 1. Successfully send struct ndv_pkt_hdr_create_1 + * 2. INIT PROVIDER + * 3. Open Adapter + * Before we can complete this call. + */ + + uctx = kzalloc(sizeof(struct hvnd_ucontext), GFP_KERNEL); + if (!uctx) + return ERR_PTR(-ENOMEM); + + atomic_set(&uctx->refcnt, 1); + INIT_LIST_HEAD(&uctx->db_page_list); + mutex_init(&uctx->db_page_mutex); + + /* + * Stash away the context with the calling PID. + */ + ret = insert_handle(nd_dev, &nd_dev->uctxidr, uctx, + current_pid()); + if (ret) { + hvnd_error("Uctx ID insertion failed; ret is %d\n", + ret); + goto err1; + } + + hvnd_debug("Opening adaptor pid is %d\n", current_pid()); + + ret = hvnd_open_adaptor(nd_dev, uctx); + if (ret) { + hvnd_error("hvnd_open_adaptor failed ret=%d\n", ret); + goto err1; + } + + } + + /* + * Copy the response out. 
+ */ + ctx_output = &uctx->o_adap_pkt.mappings.ctx_output; + resp.dev_caps = MLX4_USER_DEV_CAP_64B_CQE; + resp.qp_tab_size = ctx_output->qp_tab_size; + resp.bf_reg_size = ctx_output->bf_reg_size; + resp.bf_regs_per_page = ctx_output->bf_regs_per_page; + resp.cqe_size = ctx_output->cqe_size; + + ret = ib_copy_to_udata(udata, &resp, sizeof(resp)); + if (ret) { + hvnd_error("ib_copy_to_udata failed ret=%d\n", ret); + goto err1; + } + + return &uctx->ibucontext; + +err1: + kfree(uctx); + return ERR_PTR(ret); +} + +static int hvnd_mmap(struct ib_ucontext *context, struct vm_area_struct *vma) +{ + struct hvnd_ucontext *uctx = to_nd_context(context); + + if (vma->vm_end - vma->vm_start != PAGE_SIZE) { + hvnd_error("vma not a page size, actual size=%lu\n", + vma->vm_end - vma->vm_start); + return -EINVAL; + } + + if (vma->vm_pgoff == 0) { + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); + + if (io_remap_pfn_range(vma, vma->vm_start, + (uctx->uar_base >> PAGE_SHIFT), + PAGE_SIZE, vma->vm_page_prot)) { + hvnd_error("io_remap_pfn_range failure\n"); + return -EAGAIN; + } + } else if (vma->vm_pgoff == 1 && uctx->bf_buf_size != 0) { + vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); + + if (io_remap_pfn_range(vma, vma->vm_start, + (uctx->uar_base >> PAGE_SHIFT) + 1, + PAGE_SIZE, vma->vm_page_prot)) { + hvnd_error("io_remap_pfn_range failure\n"); + return -EAGAIN; + } + } else { + hvnd_error("check code\n"); + return -EINVAL; + } + + return 0; +} + +static int hvnd_deallocate_pd(struct ib_pd *pd) +{ + + struct hvnd_ucontext *uctx; + struct hvnd_dev *nd_dev; + struct hvnd_ib_pd *hvnd_pd; + struct ib_ucontext *ibuctx = pd->uobject->context; + + hvnd_pd = to_nd_pd(pd); + nd_dev = to_nd_dev(pd->device); + uctx = to_nd_context(ibuctx); + + hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, + IOCTL_ND_PD_FREE); + + hvnd_drop_uctx_ref(nd_dev, uctx); + return 0; +} + +static struct ib_pd *hvnd_allocate_pd(struct ib_device *ibdev, + struct ib_ucontext *context, + struct ib_udata *udata) +{ + struct hvnd_ucontext *uctx; + struct hvnd_dev *nd_dev; + int ret; + struct hvnd_ib_pd *hvnd_pd; + + if (!context) { + hvnd_error("kernel mode context not supported\n"); + return ERR_PTR(-EINVAL); + } + + hvnd_pd = kzalloc(sizeof(struct hvnd_ib_pd), GFP_KERNEL); + + if (!hvnd_pd) + return ERR_PTR(-ENOMEM); + + uctx = to_nd_context(context); + nd_dev = to_nd_dev(ibdev); + + ret = hvnd_create_pd(uctx, nd_dev, hvnd_pd); + if (ret) { + hvnd_error("hvnd_create_pd failure ret=%d\n", ret); + goto error_cr_pd; + } + + if (context) { + if (ib_copy_to_udata(udata, &hvnd_pd->pdn, sizeof(__u32))) { + hvnd_error("ib_copy_to_udata failure\n"); + ret = -EFAULT; + goto error_fault; + } + } + + hvnd_acquire_uctx_ref(uctx); + return &hvnd_pd->ibpd; + +error_fault: + hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, + IOCTL_ND_PD_FREE); + +error_cr_pd: + kfree(hvnd_pd); + return ERR_PTR(ret); +} + +static int hvnd_query_pkey(struct ib_device *ibdev, u8 port, u16 index, + u16 *pkey) +{ + debug_check(__func__, __LINE__); + *pkey = 0; + return 0; +} + +static int hvnd_query_gid(struct ib_device *ibdev, u8 port, int index, + union ib_gid *gid) +{ + int ret; + struct hvnd_dev *nd_dev = to_nd_dev(ibdev); + + debug_check(__func__, __LINE__); + + ret = wait_for_completion_timeout(&nd_dev->addr_set, 60*HZ); + if (!ret) + return -ETIMEDOUT; + + memset(&(gid->raw[0]), 0, sizeof(gid->raw)); + memcpy(&(gid->raw[0]), nd_dev->mac_addr, 6); + return 0; +} + +static int hvnd_query_device(struct ib_device *ibdev, + struct ib_device_attr *props, + 
struct ib_udata *udata) +{ + struct hvnd_dev *nd_dev = to_nd_dev(ibdev); + struct adapter_info_v2 *adap_info; + + if (!nd_dev->query_pkt_set) { + hvnd_error("query packet not received yet\n"); + return 0; + } + + adap_info = &nd_dev->query_pkt.ioctl.ad_info; + + memset(props, 0, sizeof(*props)); + + /* + * Copy the relevant properties out. + */ + props->fw_ver = 0; + props->device_cap_flags = 0; + //props->device_cap_flags |= IB_DEVICE_BAD_PKEY_CNTR; + //props->device_cap_flags |= IB_DEVICE_BAD_QKEY_CNTR; + //props->device_cap_flags |= IB_DEVICE_XRC; + + props->vendor_id = 0x15b3; + props->vendor_part_id = adap_info->device_id; + + props->max_mr_size = ~0ull; + props->page_size_cap = PAGE_SIZE; + props->max_qp = 16384; + props->max_qp_wr = min(adap_info->max_recv_q_depth, + adap_info->max_initiator_q_depth); + + props->max_sge = min(adap_info->max_initiator_sge, + adap_info->max_recv_sge); + props->max_cq = 0x1FFFF; + props->max_cqe = adap_info->max_completion_q_depth; + props->max_mr = 16384; + props->max_pd = 16384; + + props->max_qp_rd_atom = adap_info->max_inbound_read_limit; + props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; + props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; + props->max_srq = 16384; + props->max_srq_wr = adap_info->max_recv_q_depth; + props->max_srq_sge = adap_info->max_recv_sge; + + return 0; +} + +static int hvnd_query_port(struct ib_device *ibdev, u8 port, + struct ib_port_attr *props) +{ + memset(props, 0, sizeof(struct ib_port_attr)); + + props->max_mtu = IB_MTU_4096; + props->active_mtu = IB_MTU_4096; + + /* + * KYS: TBD need to base this on netdev. + */ + props->state = IB_PORT_ACTIVE; + + props->port_cap_flags = IB_PORT_CM_SUP; + + props->gid_tbl_len = 1; + props->pkey_tbl_len = 1; + props->active_width = 1; + props->active_speed = IB_SPEED_DDR; //KYS: check + props->max_msg_sz = -1; + + return 0; +} + +static enum rdma_link_layer +hvnd_get_link_layer(struct ib_device *device, u8 port) +{ + return IB_LINK_LAYER_ETHERNET; +} + +static ssize_t hvnd_show_rev(struct device *dev, struct device_attribute *attr, + char *buf) +{ + return 0; +} + +static ssize_t hvnd_show_fw_ver(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return 0; +} + +static ssize_t hvnd_show_hca(struct device *dev, struct device_attribute *attr, + char *buf) +{ + return 0; +} + +static ssize_t hvnd_show_board(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return 0; +} + +static int hvnd_get_port_immutable(struct ib_device *ibdev, u8 port_num, + struct ib_port_immutable *immutable) +{ + struct ib_port_attr attr; + int err; + + err = hvnd_query_port(ibdev, port_num, &attr); + if (err) + return err; + + immutable->pkey_tbl_len = attr.pkey_tbl_len; + immutable->gid_tbl_len = attr.gid_tbl_len; + immutable->core_cap_flags = RDMA_CORE_PORT_IWARP; + + return 0; +} + +static struct ib_qp *hvnd_ib_create_qp(struct ib_pd *pd, + struct ib_qp_init_attr *attrs, + struct ib_udata *udata) +{ + struct hvnd_ucontext *uctx; + struct hvnd_dev *nd_dev; + struct mlx4_ib_create_qp ucmd; + struct hvnd_qp *qp; + int ret = 0; + struct hvnd_ib_pd *hvnd_pd = to_nd_pd(pd); + struct hvnd_cq *send_cq = to_nd_cq(attrs->send_cq); + struct hvnd_cq *recv_cq = to_nd_cq(attrs->recv_cq); + + uctx = get_uctx_from_pd(pd); + nd_dev = to_nd_dev(pd->device); + + if (attrs->qp_type != IB_QPT_RC) { + hvnd_error("attrs->qp_type=%d not IB_QPT_RC\n", attrs->qp_type); + return ERR_PTR(-EINVAL); + } + + qp = kzalloc(sizeof(*qp), GFP_KERNEL); + if (!qp) { + ret = 
-ENOMEM; + goto err_done; + } + + qp->uctx = uctx; + + if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { + hvnd_error("ib_copy_from_udata failed\n"); + ret = -EFAULT; + goto err_ucpy; + } + + qp->qp_buf = (void *)ucmd.buf_addr; + qp->db_addr = (void *)ucmd.db_addr; + qp->log_sq_bb_count = ucmd.log_sq_bb_count; + qp->log_sq_stride = ucmd.log_sq_stride; + qp->sq_no_prefetch = ucmd.sq_no_prefetch; + qp->port = attrs->port_num; + + init_waitqueue_head(&qp->wait); + atomic_set(&qp->refcnt, 1); + + qp->recv_cq = recv_cq; + qp->send_cq = send_cq; + qp->nd_dev = nd_dev; + + qp->receive_cq_handle = recv_cq->cq_handle; + qp->initiator_cq_handle = send_cq->cq_handle; + qp->pd_handle = hvnd_pd->handle; + qp->cq_notify = false; + + qp->ibqp.qp_num = attrs->qp_type == IB_QPT_SMI ? 0 : 1; + + qp->max_inline_data = attrs->cap.max_inline_data; + + qp->initiator_q_depth = attrs->cap.max_send_wr; + qp->initiator_request_sge = attrs->cap.max_send_sge; + + + qp->receive_q_depth = attrs->cap.max_recv_wr; + qp->receive_request_sge = attrs->cap.max_recv_sge; + + set_rq_size(nd_dev, &attrs->cap, qp); + + set_user_sq_size(nd_dev, qp, &ucmd); + + qp->umem = ib_umem_get(&uctx->ibucontext, ucmd.buf_addr, + qp->buf_size, 0, 0); + if (IS_ERR(qp->umem)) { + ret = PTR_ERR(qp->umem); + hvnd_error("ib_umem_get failed ret=%d\n", ret); + goto err_ucpy; + } + + ret = hvnd_db_map_user(uctx, ucmd.db_addr, &qp->db_umem); + + if (ret) { + hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); + goto err_db_map; + } + + ret = hvnd_create_qp(nd_dev, uctx, qp); + + if (ret) { + hvnd_error("hvnd_create_qp failed ret=%d\n", ret); + goto err_qp; + } + + hvnd_acquire_uctx_ref(uctx); + + qp->ibqp.qp_num = qp->qpn; + qp->ibqp.qp_type = IB_QPT_RC; + + + return &qp->ibqp; + +err_qp: + hvnd_db_unmap_user(uctx, ucmd.db_addr); + +err_db_map: + ib_umem_release(qp->umem); + +err_ucpy: + kfree(qp); +err_done: + return ERR_PTR(ret); +} + +static int hvnd_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, + int attr_mask, struct ib_udata *udata) +{ + struct hvnd_qp *qp = to_nd_qp(ibqp); + struct hvnd_dev *nd_dev = to_nd_dev(ibqp->device); + enum ib_qp_state cur_state, new_state; + int ret = 0; + + if (attr != NULL) { + cur_state = (attr_mask & IB_QP_CUR_STATE) + ? attr->cur_qp_state : qp->qp_state; + new_state = (attr_mask & IB_QP_STATE) + ? attr->qp_state : cur_state; + + hvnd_debug("qp->qp_state is %d new state is %d\n", + qp->qp_state, new_state); + hvnd_debug("current qp state is %d\n", cur_state); + if (attr_mask & IB_QP_STATE) { + /* Ensure the state is valid */ + if (attr->qp_state < 0 || attr->qp_state > IB_QPS_ERR) { + hvnd_error("incorrect qp state attr->qp_state=%d\n", + attr->qp_state); + return -EINVAL; + } + + if (qp->qp_state != new_state) { + qp->qp_state = new_state; + /* + * The only state transition supported is the + * transition to error state. 
+ */ + switch (new_state) { + case IB_QPS_ERR: + case IB_QPS_SQD: + ret = hvnd_flush_qp(nd_dev, qp->uctx, + qp); + + if (ret) + hvnd_error("hvnd_flush_qp failed ret=%d\n", ret); + + // immediately notify the upper layer + // on disconnection + if (!ret && qp->connector) + hvnd_process_notify_disconnect( + qp->connector, + STATUS_SUCCESS); + + return ret; + + default: + break; + } + } + } + } + return 0; +} + + +static int hvnd_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, + int attr_mask, struct ib_qp_init_attr *init_attr) +{ + struct hvnd_qp *qp = to_nd_qp(ibqp); + + memset(attr, 0, sizeof(*attr)); + memset(init_attr, 0, sizeof(*init_attr)); + + attr->qp_state = qp->qp_state; + + init_attr->cap.max_send_wr = qp->max_send_wr; + init_attr->cap.max_recv_wr = qp->max_recv_wr; + + init_attr->cap.max_send_sge = qp->max_send_sge; + init_attr->cap.max_recv_sge = qp->max_recv_sge; + init_attr->cap.max_inline_data = qp->max_inline_data; + + init_attr->sq_sig_type = IB_SIGNAL_ALL_WR; + + return 0; +} + +static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status); +static int hvnd_destroy_qp(struct ib_qp *ib_qp) +{ + int ret; + struct hvnd_qp *qp = to_nd_qp(ib_qp); + struct hvnd_dev *nd_dev = to_nd_dev(ib_qp->device); + u64 jiffies; + + if (!qp->connector) { + hvnd_warn("error: connector is NULL; skip destroying connector\n"); + goto free_qp; + } + + /* + * should we flush the qp first on ctrl-C?, + * no need to disconnect on abrupt shutdown? + */ + if (qp->qp_state != IB_QPS_ERR && qp->qp_state != IB_QPS_SQD) { + hvnd_warn("qp_state=%d, doing abrupt disconnect\n", + qp->qp_state); + hvnd_flush_qp(nd_dev, qp->uctx, qp); + + ep_stop(qp->connector); + + // now no pending activity is possible on the connector + + switch (qp->connector->cm_state) { + + case hvnd_cm_idle: + case hvnd_cm_connect_reply_refused: + case hvnd_cm_connect_request_sent: + case hvnd_cm_close_sent: + hvnd_warn("cm_state = %d not doing anything\n", + qp->connector->cm_state); + break; + + case hvnd_cm_connect_received: + hvnd_warn("cm_state = %d refusing pending connection request\n", + qp->connector->cm_state); + hvnd_refuse_connection(qp->connector, -ECONNREFUSED); + break; + + case hvnd_cm_connect_reply_sent: + case hvnd_cm_established_sent: + case hvnd_cm_accept_sent: + hvnd_warn("cm_state = %d notifying disconnect on existing connection\n", + qp->connector->cm_state); + hvnd_process_notify_disconnect(qp->connector, + STATUS_CANCELLED); + break; + + default: + hvnd_error("unknown cm_state = %d\n", + qp->connector->cm_state); + + } + goto free_connector; + } else { + hvnd_debug("qp_state=%d, doing normal disconnect\n", + qp->qp_state); + } + + if (!ep_add_work_pending(qp->connector)) + goto free_connector; + + init_completion(&qp->connector->disconnect_event); + + /* + * First issue a disconnect on the connector. + */ + + hvnd_debug("calling hvnd_connector_disconnect\n"); + ret = hvnd_connector_disconnect(nd_dev, qp->uctx, + qp->connector->ep_handle, + qp->connector); + if (ret) { + ep_del_work_pending(qp->connector); + hvnd_error("disconnect: retval is %d\n", ret); + ep_stop(qp->connector); + goto free_connector; + } + /* + * Now wait for the disconnect. + */ + jiffies = get_jiffies_64(); + wait_for_completion(&qp->connector->disconnect_event); + hvnd_debug("Completed disconnect connector=%p jiffies=%llu\n", + qp->connector, get_jiffies_64() - jiffies); + + /* + * Now free up the connector and drop the reference on uctx. 
+ */ + + ep_stop(qp->connector); + +free_connector: + hvnd_debug("destroying connector handle: %p\n", + (void *) qp->connector->ep_handle); + hvnd_free_handle(nd_dev, qp->uctx, + qp->connector->ep_handle, + IOCTL_ND_CONNECTOR_FREE); + + hvnd_drop_uctx_ref(nd_dev, qp->uctx); + hvnd_destroy_ep(qp->connector); + qp->connector = NULL; +free_qp: + atomic_dec(&qp->refcnt); + hvnd_debug("Waiting for the ref cnt to go to 0\n"); + + wait_event(qp->wait, !atomic_read(&qp->refcnt)); + + hvnd_debug("About to destroy qp\n"); + hvnd_db_unmap_user(qp->uctx, (u64)qp->db_addr); + ib_umem_release(qp->umem); + + hvnd_debug("About to free qp\n"); + ret = hvnd_free_qp(nd_dev, qp->uctx, qp); + + if (ret == 0) { + hvnd_drop_uctx_ref(nd_dev, qp->uctx); + kfree(qp); + } else { + hvnd_error("free qp failed: ret is %d\n", ret); + } + + return ret; +} + +static struct ib_cq *hvnd_ib_create_cq(struct ib_device *ibdev, + const struct ib_cq_init_attr *attr, + struct ib_ucontext *ib_context, + struct ib_udata *udata) +{ + struct hvnd_ucontext *uctx; + struct hvnd_dev *nd_dev; + struct mlx4_ib_create_cq ucmd; + struct hvnd_cq *cq; + int ret = 0; + int entries = attr->cqe; + + uctx = to_nd_context(ib_context); + nd_dev = to_nd_dev(ibdev); + + if (entries < 1 || entries > uctx->max_cqe) { + hvnd_error("incorrect entries=%d\n", entries); + ret = -EINVAL; + goto err_done; + } + + cq = kzalloc(sizeof(*cq), GFP_KERNEL); + if (!cq) { + ret = -ENOMEM; + goto err_done; + } + + entries = roundup_pow_of_two(entries + 1); + cq->ibcq.cqe = entries - 1; + cq->entries = entries; + cq->uctx = uctx; + + if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { + hvnd_error("ib_copy_from_udata failed\n"); + ret = -EFAULT; + goto err_ucpy; + } + + cq->cq_buf = (void *)ucmd.buf_addr; + cq->db_addr = (void *)ucmd.db_addr; + cq->arm_sn = 0; + + /* + * Initialize the IRP state. Need to have a separate irp state + * for CQ; for now share it with Listener/connector.
+ */ + ret = hvnd_init_ep(&cq->ep_object, NULL, ND_CQ, nd_dev, uctx); + + if (ret) { + hvnd_error("hvnd_init_ep failed ret=%d\n", ret); + goto err_ucpy; + } + + cq->ep_object.cq = cq; + cq->monitor = true; + + cq->umem = ib_umem_get(ib_context, ucmd.buf_addr, + (entries * uctx->cqe_size), + IB_ACCESS_LOCAL_WRITE, 1); + if (IS_ERR(cq->umem)) { + ret = PTR_ERR(cq->umem); + hvnd_error("ib_umem_get failed ret=%d\n", ret); + goto err_ucpy; + } + + ret = hvnd_db_map_user(uctx, ucmd.db_addr, &cq->db_umem); + + if (ret) { + hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); + goto err_db_map; + } + + ret = hvnd_create_cq(nd_dev, uctx, cq); + + if (ret) { + hvnd_error("hvnd_create_cq failed ret=%d\n", ret); + goto err_cq; + } + + cq->ep_object.ep_handle = cq->cq_handle; + + if (ib_copy_to_udata(udata, &cq->cqn, sizeof(__u32))) { + hvnd_error("ib_copy_to_udata failed\n"); + ret = -EFAULT; + goto err_ucpy_out; + } + + if (!disable_cq_notify) { + + if (!ep_add_work_pending(&cq->ep_object)) + goto err_ucpy_out; + + ret = hvnd_notify_cq(nd_dev, cq, ND_CQ_NOTIFY_ANY, + (u64)&cq->ep_object); + + if (ret) { + ep_del_work_pending(&cq->ep_object); + hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); + goto err_ucpy_out; + } + } + + hvnd_acquire_uctx_ref(uctx); + + return &cq->ibcq; + +err_ucpy_out: + hvnd_destroy_cq(nd_dev, cq); + +err_cq: + hvnd_db_unmap_user(uctx, ucmd.db_addr); + +err_db_map: + ib_umem_release(cq->umem); + +err_ucpy: + kfree(cq); +err_done: + return ERR_PTR(ret); +} + +static struct ib_qp *hvnd_get_qp(struct ib_device *dev, int qpn) +{ + struct hvnd_dev *nd_dev; + struct hvnd_qp *qp = NULL; + + nd_dev = to_nd_dev(dev); + qp = get_qpp(nd_dev, qpn); + return qp ? &qp->ibqp : NULL; +} + +static int hvnd_ib_destroy_cq(struct ib_cq *ib_cq) +{ + struct hvnd_ucontext *uctx; + struct hvnd_dev *nd_dev; + struct hvnd_cq *cq; + + cq = to_nd_cq(ib_cq); + uctx = cq->uctx; + nd_dev = to_nd_dev(uctx->ibucontext.device); + + cq->monitor = false; + +// hvnd_cancel_io(&cq->ep_object); + ep_stop(&cq->ep_object); + + hvnd_deinit_ep(&cq->ep_object); + + hvnd_db_unmap_user(uctx, (u64)cq->db_addr); + ib_umem_release(cq->umem); + + hvnd_destroy_cq(nd_dev, cq); + + hvnd_drop_uctx_ref(nd_dev, uctx); + kfree(cq); + + return 0; +} + +static int hvnd_resize_cq(struct ib_cq *cq, int cqe, struct ib_udata *udata) +{ + /* + * NDDirect does not support resizing CQ.
+ */ + hvnd_info("check code\n"); + return -ENOSYS; +} + +static int hvnd_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc) +{ + hvnd_info("check code\n"); + return 0; +} + +static struct ib_mr *hvnd_get_dma_mr(struct ib_pd *pd, int acc) +{ + hvnd_info("check code\n"); + return NULL; +} + +static void debug_dump_umem(struct ib_umem *umem) +{ +#ifdef HVND_MEM_DEBUG + struct ib_umem_chunk *chunk; + struct scatterlist *sg; + int len, j, entry; + int shift = ffs(umem->page_size) - 1; + + hvnd_debug("umem=%p\n", umem); + hvnd_debug("context=%p length=%lu offset=%d page_size=%d writable=%d hugetlb=%d\n", + umem->context, + umem->length, + umem->offset, + umem->page_size, + umem->writable, + umem->hugetlb); + + list_for_each_entry(chunk, &umem->chunk_list, list) { + hvnd_debug("chunk->nmap=%d\n", chunk->nmap); + for (j = 0; j < chunk->nmap; ++j) { + sg = &chunk->page_list[j]; + hvnd_debug("sg_dma_len=%d sg_dma_address=%llx\n", + sg_dma_len(sg), sg_dma_address(sg)); + hvnd_debug("page_link=%lx offset=%u length=%u\n", + sg->page_link, sg->offset, sg->length); + len = sg_dma_len(&chunk->page_list[j]) >> shift; + for_each_sg(&chunk->page_list[j], sg, len, entry) { + hvnd_debug("PFN=%lu\n", + page_to_pfn(sg_page(sg))); + } + } + } +#endif +} + + +static struct ib_mr *hvnd_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, + u64 virt, int acc, struct ib_udata *udata) +{ + int err = 0; + struct hvnd_ib_pd *hvndpd = to_nd_pd(pd); + struct hvnd_mr *mr; + + mr = kmalloc(sizeof(*mr), GFP_KERNEL); + if (!mr) + return ERR_PTR(-ENOMEM); + mr->pd = hvndpd; + + mr->umem = ib_umem_get(pd->uobject->context, start, length, acc, 0); + if (IS_ERR(mr->umem)) { + err = PTR_ERR(mr->umem); + hvnd_error("ib_umem_get failed ret=%d\n", err); + kfree(mr); + return ERR_PTR(err); + } + + debug_dump_umem(mr->umem); + + mr->start = start; + mr->length = length; + mr->virt = virt; + mr->acc = acc; + + hvnd_debug("start=%llx length=%llx virt=%llx acc=%d\n", + start, length, virt, acc); + + /* + * First create a memory region. + */ + err = hvnd_cr_mr(to_nd_dev(pd->device), + to_nd_context(pd->uobject->context), hvndpd->handle, + &mr->mr_handle); + if (err) { + hvnd_error("cr_mr failed; ret is %d\n", err); + goto err; + } + + err = hvnd_mr_register(to_nd_dev(pd->device), + to_nd_context(pd->uobject->context), mr); + + if (err) + goto err0; + + hvnd_acquire_uctx_ref(to_nd_context(pd->uobject->context)); + + return &mr->ibmr; + +err0: + hvnd_free_mr(to_nd_dev(pd->device), + to_nd_context(pd->uobject->context), mr->mr_handle); +err: + ib_umem_release(mr->umem); + kfree(mr); + return ERR_PTR(err); +} + + + +static int hvnd_dereg_mr(struct ib_mr *ib_mr) +{ + int ret; + struct hvnd_mr *mr = to_nd_mr(ib_mr); + struct hvnd_ucontext *uctx = to_nd_context(ib_mr->pd->uobject->context); + struct hvnd_dev *nd_dev = to_nd_dev(ib_mr->device); + + + hvnd_debug("dereg_mr entering\n"); + + ret = hvnd_deregister_mr(nd_dev, uctx, mr->mr_handle); + + if (ret) { + hvnd_error("hvnd_deregister_mr() failed: %x\n", ret); + return ret; + } + /* + * Now free up the memory region. 
+ */ + + ret = hvnd_free_mr(nd_dev, uctx, mr->mr_handle); + if (ret) { + hvnd_error("hvnd_free_mr() failed: %x\n", ret); + return ret; + } + + ib_umem_release(mr->umem); + + hvnd_drop_uctx_ref(nd_dev, uctx); + kfree(mr); + + hvnd_debug("dereg_mr done\n"); + return 0; +} + +static struct ib_mw *hvnd_alloc_mw(struct ib_pd *pd, enum ib_mw_type type, + struct ib_udata *udata) +{ + hvnd_info("check code\n"); + return NULL; +} + +static int hvnd_dealloc_mw(struct ib_mw *mw) +{ + debug_check(__func__, __LINE__); + return 0; +} + +static int hvnd_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags) +{ + struct hvnd_ucontext *uctx; + struct hvnd_dev *nd_dev; + struct hvnd_cq *cq; + + cq = to_nd_cq(ibcq); + uctx = cq->uctx; + nd_dev = to_nd_dev(uctx->ibucontext.device); + + + debug_check(__func__, __LINE__); + + return 0; +} + +static int hvnd_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, + struct ib_send_wr **bad_wr) +{ + debug_check(__func__, __LINE__); + return 0; +} + +int hvnd_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr, + struct ib_recv_wr **bad_wr) +{ + debug_check(__func__, __LINE__); + return 0; +} + +static int hvnd_resolve_addr(struct sockaddr_in *laddr, + struct sockaddr_in *raddr, + struct if_physical_addr *phys_addrstruct) +{ + int ret; + + phys_addrstruct->length = ETH_ALEN; + ret = hvnd_get_neigh_mac_addr((struct sockaddr *)laddr, + (struct sockaddr *)raddr, + phys_addrstruct->addr); + + hvnd_debug("Dest MAC is %pM\n", phys_addrstruct->addr); + return ret; +} + +static int hvnd_connect(struct iw_cm_id *cm_id, + struct iw_cm_conn_param *conn_param) +{ + int ret = 0; + struct hvnd_dev *nd_dev; + struct hvnd_ep_obj *ep_object; + struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_id->remote_addr; + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; + struct hvnd_qp *qp; + struct if_physical_addr phys_addrstruct; + union nd_sockaddr_inet dest_addr; + u64 connector_handle; + union nd_sockaddr_inet addr; + char addr_buf[50]; + + if (cm_id->remote_addr.ss_family != AF_INET) { + hvnd_error("cm_id->remote_addr.ss_family=%d not AF_INET\n", + cm_id->remote_addr.ss_family); + return -ENOSYS; + } + + qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); + + if (!qp) { + hvnd_error("failed to find qp conn_param->qpn=%d\n", + conn_param->qpn); + return -EINVAL; + } + + cm_id->provider_data = qp; + cm_id->add_ref(cm_id); + qp->cm_id = cm_id; + + /* + * Set the read/write limits. + * Can we change the limits on a created QP? Luke? + */ + nd_dev = to_nd_dev(cm_id->device); + ep_object = hvnd_setup_ep(cm_id, ND_CONNECTOR, nd_dev, qp->uctx); + if (!ep_object) { + hvnd_error("hvnd_setup_ep failure\n"); + ret = -ENOMEM; + goto err_limit; + } + hvnd_debug("active connection: local irp is %d\n", + ep_object->local_irp); + + ret = hvnd_cr_connector(nd_dev, qp->uctx, + &connector_handle); + + if (ret) { + hvnd_error("hvnd_cr_connector failure ret=%d\n", ret); + goto err_cr_connector; + } + + hvnd_acquire_uctx_ref(qp->uctx); + ep_object->ep_handle = connector_handle; + ep_object->incoming = false; + qp->connector = ep_object; + + /* + * Bind the local address to the connector.
+ */ + hvnd_debug("Connect local address is %s\n", + debug_inet_ntoa(laddr->sin_addr, addr_buf)); + + memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); + hvnd_debug("CONNECT AF %d port %d addr %s\n", + addr.ipv4.sin_family, addr.ipv4.sin_port, + debug_inet_ntoa(addr.ipv4.sin_addr, addr_buf)); + + ret = hvnd_bind_connector(nd_dev, qp->uctx, + connector_handle, + &addr); + + if (ret) { + hvnd_error("hvnd_bind_connector failed ret=%d\n", ret); + goto err_bind_connector; + } + + ret = hvnd_resolve_addr(laddr, raddr, &phys_addrstruct); + if (ret) { + hvnd_error("hvnd_resolve_addr failed ret=%d\n", ret); + goto err_bind_connector; + } + + memcpy(&dest_addr.ipv4, raddr, sizeof(struct sockaddr_in)); + + + + /* + * Now attempt to connect. + */ + + hvnd_debug("About to initiate connection\n"); + + if (!ep_add_work_pending(ep_object)) + goto err_bind_connector; + + ep_object->cm_state = hvnd_cm_connect_received; + ret = hvnd_connector_connect(nd_dev, qp->uctx, + ep_object->ep_handle, + conn_param->ird, conn_param->ord, + conn_param->private_data_len, + (u8 *)conn_param->private_data, + qp->qp_handle, + &phys_addrstruct, &dest_addr, + ep_object); + + if (ret == 0) + return 0; + + ep_object->cm_state = hvnd_cm_idle; + ep_del_work_pending(ep_object); + hvnd_error("hvnd_connector_connect failed ret=%d\n", ret); + +err_bind_connector: + qp->connector = NULL; + hvnd_free_connector(nd_dev, qp->uctx, + connector_handle); + hvnd_drop_uctx_ref(nd_dev, qp->uctx); + +err_cr_connector: + kfree(ep_object); + +err_limit: + cm_id->provider_data = NULL; + qp->cm_id = NULL; + cm_id->rem_ref(cm_id); + return ret; +} + +static int hvnd_accept_cr(struct iw_cm_id *cm_id, + struct iw_cm_conn_param *conn_param) +{ + int ret = 0; + struct hvnd_dev *nd_dev; + struct hvnd_qp *qp; + struct hvnd_ep_obj *connector; + enum ibv_qp_state new_qp_state; + + hvnd_debug("Accepting connection - PASSIVE\n"); + nd_dev = to_nd_dev(cm_id->device); + qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); + + if (!qp) { + hvnd_error("get_qpp failed conn_param->qpn=%d\n", + conn_param->qpn); + return -EINVAL; + } + + + connector = (struct hvnd_ep_obj *)cm_id->provider_data; + qp->connector = connector; + + if (connector == NULL) { + hvnd_error("NULL connector!\n"); + return -EINVAL; + } + hvnd_debug("connector's cm_id is %p caller cm_id=%p\n", + connector->cm_id, cm_id); + + connector->cq = qp->recv_cq; + + + /* + * Setup state for the accepted connection. 
+ */ + cm_id->add_ref(cm_id); + connector->cm_id = cm_id; + if (conn_param == NULL) { + hvnd_error("NULL conn_param!\n"); + return -EINVAL; + } + + connector->ord = conn_param->ord; + connector->ird = conn_param->ird; + + if (!ep_add_work_pending(connector)) + goto error; + + init_completion(&connector->connector_accept_event); + + ret = hvnd_connector_accept(nd_dev, qp->uctx, connector->ep_handle, + qp->qp_handle, conn_param->ird, + conn_param->ord, + conn_param->private_data_len, + conn_param->private_data, + &new_qp_state, connector); + + if (ret) { + ep_del_work_pending(connector); + hvnd_error("connector accept failed\n"); + goto error; + } + + wait_for_completion(&connector->connector_accept_event); + ret = connector->connector_accept_status; + + if (ret) { + hvnd_error("connector_accept failed status=%x\n", ret); + ret = -EIO; + goto error; + } + + hvnd_debug("Passive Connection Accepted; new qp state is %d\n", + new_qp_state); + connector->cm_state = hvnd_cm_accept_sent; + return 0; + +error: + ep_stop(connector); + connector->cm_id = NULL; + connector->cm_state = hvnd_cm_idle; + + qp->connector = NULL; + cm_id->rem_ref(cm_id); + + return ret; +} + +static int hvnd_reject_cr(struct iw_cm_id *cm_id, const void *pdata, + u8 pdata_len) +{ + debug_check(__func__, __LINE__); + return 0; +} + +void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status) +{ + struct iw_cm_event cm_event; + + switch (status) { + case STATUS_SUCCESS: + case STATUS_CANCELLED: + break; + + default: + hvnd_warn("disconnect complete failed: status:%d\n", status); + } + + + hvnd_debug("active disconnect processed\n"); + memset(&cm_event, 0, sizeof(cm_event)); + + complete(&ep_object->disconnect_event); +} + + +void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status) +{ + struct iw_cm_event cm_event; + + // make sure we only disconnect once + if (atomic_xchg(&ep_object->disconnect_notified, 1)) + return; + + /* + * Turn off CQ monitoring. + */ + if (ep_object->cq) + ep_object->cq->monitor = false; + + switch (ep_object->cm_state) { + case hvnd_cm_connect_reply_sent: + case hvnd_cm_established_sent: + case hvnd_cm_accept_sent: + break; + + default: + hvnd_error("unexpected cm_state=%d\n", ep_object->cm_state); + return; + } + + switch (status) { + case STATUS_SUCCESS: + case STATUS_CANCELLED: + case STATUS_DISCONNECTED: + break; + + default: + hvnd_warn("notify disconnect complete failed: status:%d\n", + status); + } + + hvnd_debug("passive disconnect notified\n"); + memset(&cm_event, 0, sizeof(cm_event)); + + /* + * Other end disconnected. + * Connection has been disconnected; + * notify the cm layer. + */ + cm_event.status = -ECONNRESET; + cm_event.event = IW_CM_EVENT_CLOSE; + + if ((ep_object->cm_id) && + (ep_object->cm_id->event_handler)) { + + ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); + + ep_object->cm_id->rem_ref(ep_object->cm_id); + ep_object->cm_state = hvnd_cm_close_sent; + } +} + +void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status) +{ + struct iw_cm_event cm_event; + int ret; + + /* + * This is the problem area the return status may be: + * 1: 0xc00000b5 (3221225653) - {Device Timeout} The specified I/O + * operation on %hs was not completed before the time-out period + * expired + * 2: NTSTATUS 0xc0000241 (3221226049) - The transport connection was + * aborted by the local system. 
+ * If we do nothing here, iwcm will wait for IW_CM_EVENT_ESTABLISHED + * forever and be unable to shut down cleanly; we need to fail the call + * earlier, at accept time. + */ + + ep_object->connector_accept_status = status; + + if (status) { + hvnd_error("Connector accept failed; status is %x\n", status); + complete(&ep_object->connector_accept_event); + return; + } + + memset(&cm_event, 0, sizeof(cm_event)); + cm_event.event = IW_CM_EVENT_ESTABLISHED; + cm_event.ird = ep_object->ird; + cm_event.ord = ep_object->ord; + cm_event.provider_data = (void *)ep_object; + + /* + * We have successfully passively accepted the + * incoming connection. + */ + + hvnd_debug("Passive connection accepted!!\n"); + if ((ep_object->cm_id) && + (ep_object->cm_id->event_handler)) { + ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); + ep_object->cm_state = hvnd_cm_established_sent; + } + + complete(&ep_object->connector_accept_event); + + /* + * Request notification if the other end + * were to disconnect. + */ + if (!ep_add_work_pending(ep_object)) + return; + + ret = hvnd_connector_notify_disconnect(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + ep_object); + + if (ret) { + ep_del_work_pending(ep_object); + hvnd_error("Connector notify disconnect failed; ret: %d\n", + ret); + } +} + + +void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep_object, + int status) +{ + + struct ib_cq *ibcq; + struct hvnd_cq *cq; + + cq = ep_object->cq; + ibcq = &ep_object->cq->ibcq; + + if (!cq->monitor) + return; + + // call the previous CQ complete + if ((status == STATUS_PENDING) && cq->upcall_pending && + ibcq->comp_handler) { + ibcq->comp_handler(ibcq, ibcq->cq_context); + cq->upcall_pending = false; + hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); + } + + if ((status != STATUS_PENDING) && ibcq->comp_handler && + ibcq->cq_context) { + ibcq->comp_handler(ibcq, ibcq->cq_context); + hvnd_error("CQ comp_handler called status=%x\n", status); + } +} + +void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep_object, + int status) +{ + struct ib_cq *ibcq; + struct hvnd_cq *cq; + int ret; + + cq = ep_object->cq; + ibcq = &ep_object->cq->ibcq; + + // call the previous CQ complete + if (cq->upcall_pending && ibcq->comp_handler) { + ibcq->comp_handler(ibcq, ibcq->cq_context); + cq->upcall_pending = false; + hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); + } + + cq->upcall_pending = true; + if (!ep_add_work_pending(ep_object)) + return; + + ret = hvnd_notify_cq(ep_object->nd_dev, + ep_object->cq, + ND_CQ_NOTIFY_ANY, + (u64)ep_object); + + if (ret) { + ep_del_work_pending(ep_object); +// hvnd_manage_io_state(ep_object, true); + hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); + } + + if ((status != 0) && (status != STATUS_CANCELLED)) { + if (ibcq->event_handler) { + struct ib_event event; + + event.device = ibcq->device; + event.event = IB_EVENT_CQ_ERR; + event.element.cq = ibcq; + ibcq->event_handler(&event, ibcq->cq_context); + + hvnd_warn("CQ event_handler called status=%x\n", + status); + } + } +} + +int init_cm_event(struct hvnd_ep_obj *ep_object, struct iw_cm_event *cm_event, + int event) +{ + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event->local_addr; + struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event->remote_addr; + struct nd_read_limits rd_limits; + union nd_sockaddr_inet local_addr; + union nd_sockaddr_inet remote_addr; + int ret; + + /* + * Now get the local address.
+ */ + ret = hvnd_connector_get_local_addr(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &local_addr); + + if (ret) { + hvnd_error("Connector get addr failed; ret: %d\n", ret); + return ret; + } + /* + * Now get the remote address. + */ + ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &remote_addr); + + if (ret) { + hvnd_error("Connector get peer addr failed; ret: %d\n", ret); + return ret; + } + + /* + * Get other connection parameters. + */ + + ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &rd_limits); + + if (ret) { + hvnd_error("Connector rd limits failed; ret: %d\n", ret); + return ret; + } + + /* + * XXXKYS: Luke: What about the length of the priv data? + */ + ret = hvnd_connector_get_priv_data(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + ep_object->priv_data); + + if (ret) { + hvnd_error("Connector get priv data failed; ret: %d\n", ret); + return ret; + } + /* + * Initialize CM structure. + */ + laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; + hvnd_debug("Local addr is %d\n", laddr->sin_addr.s_addr); + laddr->sin_port = local_addr.ipv4.sin_port; + laddr->sin_family = AF_INET; + + raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; + hvnd_debug("Remote addr is %d\n", raddr->sin_addr.s_addr); + raddr->sin_port = remote_addr.ipv4.sin_port; + raddr->sin_family = AF_INET; + + // KYS; LUke: is it always 148 bytes? + cm_event->private_data_len = MAX_PRIVATE_DATA_LEN; + cm_event->private_data = ep_object->priv_data; + + cm_event->ird = rd_limits.inbound; + cm_event->ord = rd_limits.outbound; + cm_event->event = event; + + ep_object->ird = cm_event->ird; + ep_object->ord = cm_event->ord; + + return 0; +} + +static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status) +{ + struct iw_cm_event cm_event; + + memset(&cm_event, 0, sizeof(cm_event)); + + cm_event.event = IW_CM_EVENT_CONNECT_REPLY; + cm_event.status = status; + + hvnd_debug("returning status %d on connector %p\n", status, connector); + + if (connector->cm_id && connector->cm_id->event_handler) { + connector->cm_id->event_handler(connector->cm_id, &cm_event); + connector->cm_id->rem_ref(connector->cm_id); + connector->cm_state = hvnd_cm_connect_reply_refused; + } +} + +void hvnd_process_events(struct work_struct *work) +{ + struct hvnd_work *wrk; + struct nd_read_limits rd_limits; + struct hvnd_ep_obj *ep_object; + struct hvnd_ep_obj *parent; + struct iw_cm_event cm_event; + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event.local_addr; + struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event.remote_addr; + struct ndv_packet_hdr_control_1 *ctrl_hdr; + union nd_sockaddr_inet local_addr; + union nd_sockaddr_inet remote_addr; + struct pkt_nd_get_connection_listener *connection_pkt; + struct iw_cm_id *cm_id = NULL; + int status; + int ioctl; + int ret; + char priv_data[MAX_PRIVATE_DATA_LEN]; + enum ibv_qp_state new_qp_state; + struct incoming_pkt *incoming_pkt; + unsigned long flags; + + memset(&cm_event, 0, sizeof(cm_event)); + memset(&priv_data, 0, MAX_PRIVATE_DATA_LEN); + + + wrk = container_of(work, struct hvnd_work, work); + + /* + * Now call into the connection manager. 
+ */ + ep_object = (struct hvnd_ep_obj *)wrk->callback_arg; + parent = ep_object->parent; + +process_next: + incoming_pkt = NULL; + spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); + if (!list_empty(&ep_object->incoming_pkt_list)) { + incoming_pkt = list_first_entry(&ep_object->incoming_pkt_list, + struct incoming_pkt, + list_entry); + list_del(&incoming_pkt->list_entry); + } + spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, flags); + if (incoming_pkt == NULL) + return; + + ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt->pkt; + status = ctrl_hdr->io_status; + ioctl = ctrl_hdr->io_cntrl_code; + + hvnd_debug("Process Events IOCTL is: %s; iostatus failure: %x in work queue\n", + hvnd_get_op_name(ioctl), status); + + if (status != 0) { + bool log_error = true; + + if (ioctl == IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT && + status == STATUS_DISCONNECTED) // expected + log_error = false; + + if (log_error) + hvnd_warn("Process Events IOCTL is: %s; iostatus failure: %x\n", + hvnd_get_op_name(ioctl), status); + } + + cm_event.status = status; + + switch (ep_object->type) { + case ND_CONNECTOR: + switch (ioctl) { + case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: + + if (ep_object->parent != NULL) { + + // Do nothing with this connection request if + // listener is stopping + if (!ep_add_work_pending(ep_object->parent)) + break; + + cm_id = ep_object->parent->cm_id; //Listener + } + + connection_pkt = (struct pkt_nd_get_connection_listener *) ctrl_hdr; + + if ((status == 0) || (status == STATUS_CANCELLED)) { + hvnd_get_incoming_connections(ep_object->parent, + ep_object->parent->nd_dev, + ep_object->uctx); + } + + if (status) + goto get_connection_request_done; + + /* + * Now get the local address. + */ + ret = hvnd_connector_get_local_addr(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &local_addr); + + if (ret) { + hvnd_error("Connector get addr failed; ret: %d\n", + ret); + goto get_connection_request_done; + } + /* + * Now get the remote address. + */ + ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &remote_addr); + + if (ret) { + hvnd_error("Connector get peer addr failed; ret: %d\n", + ret); + goto get_connection_request_done; + } + /* + * Get other connection parameters. + */ + + ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &rd_limits); + + if (ret) { + hvnd_error("Connector rd limits failed; ret: %d\n", + ret); + goto get_connection_request_done; + } + + /* + * XXXKYS: Luke: What about the length of the priv data? + */ + ret = hvnd_connector_get_priv_data( + ep_object->nd_dev, ep_object->uctx, + ep_object->ep_handle, ep_object->priv_data); + + if (ret) { + hvnd_error("Connector get priv data failed; ret: %d\n", + ret); + goto get_connection_request_done; + } + + cm_event.event = IW_CM_EVENT_CONNECT_REQUEST; + cm_event.provider_data = (void *)ep_object; + + laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; + hvnd_debug("Local addr is %d\n", + laddr->sin_addr.s_addr); + laddr->sin_port = local_addr.ipv4.sin_port; + laddr->sin_family = AF_INET; + + raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; + hvnd_debug("Remote addr is %d\n", + raddr->sin_addr.s_addr); + raddr->sin_port = remote_addr.ipv4.sin_port; + raddr->sin_family = AF_INET; + + // KYS; Luke: is it always 148 bytes?
+ cm_event.private_data_len = MAX_PRIVATE_DATA_LEN; + cm_event.private_data = ep_object->priv_data; + + cm_event.ird = rd_limits.inbound; + cm_event.ord = rd_limits.outbound; + + ep_object->ird = cm_event.ird; + ep_object->ord = cm_event.ord; + + + if ((cm_id != NULL) && cm_id->event_handler) { + cm_id->event_handler(cm_id, &cm_event); + ep_object->cm_state = hvnd_cm_connect_request_sent; + } + +get_connection_request_done: + if (ep_object->parent != NULL) + ep_del_work_pending(ep_object->parent); + break; + + case IOCTL_ND_CONNECTOR_CONNECT: + + cm_event.event = IW_CM_EVENT_CONNECT_REPLY; + if (status == STATUS_TIMEOUT && + ep_object->connector_connect_retry < 3) { + //TIMEOUT retry + + if (!ep_add_work_pending(ep_object)) + goto refuse_connection; + + hvnd_warn("Connector connect timed out, reconnecting... retry count: %d\n", + ep_object->connector_connect_retry); + ep_object->connector_connect_retry++; + ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, + &ep_object->connector_connect_pkt.hdr, + sizeof(ep_object->connector_connect_pkt), + (u64)&ep_object->connector_connect_pkt); + + if (ret) { + hvnd_error("Connector connect on timeout failed: %d\n", + ret); + ep_del_work_pending(ep_object); + goto refuse_connection; + } + break; + } + +refuse_connection: + if (status) { + cm_event.status = -ECONNREFUSED; + if (status == STATUS_TIMEOUT) + cm_event.status = -ETIMEDOUT; + + hvnd_refuse_connection(ep_object, + cm_event.status); + break; + } + + hvnd_debug("ACTIVE Connection ACCEPTED\n"); + ret = init_cm_event(ep_object, &cm_event, + IW_CM_EVENT_CONNECT_REPLY); + if (ret) { + hvnd_error("init_cm_event failed ret=%d\n", + ret); + goto process_done; + } + + ret = hvnd_connector_complete_connect(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &new_qp_state); + if (ret) { + hvnd_error("connector_complete failed\n"); + goto process_done; + } + + cm_event.provider_data = (void *)ep_object; + + if ((ep_object->cm_id) && + (ep_object->cm_id->event_handler)) { + ep_object->cm_id->event_handler( + ep_object->cm_id, &cm_event); + ep_object->cm_state = hvnd_cm_connect_reply_sent; + } + /* + * Request notification if the other end + * were to disconnect.
+ */ + if (!ep_add_work_pending(ep_object)) + goto process_done; + + ret = hvnd_connector_notify_disconnect( + ep_object->nd_dev, ep_object->uctx, + ep_object->ep_handle, ep_object); + + if (ret) { + ep_del_work_pending(ep_object); + hvnd_error("Connector notify disconnect failed; ret: %d\n", + ret); + } + + break; + + case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: + hvnd_process_notify_disconnect(ep_object, status); + break; + + + default: + hvnd_error("Unknown Connector IOCTL\n"); + break; + } + break; + default: + hvnd_error("Unknown endpoint object\n"); + break; + } +process_done: + kfree(incoming_pkt); + ep_del_work_pending(ep_object); + + goto process_next; +} + + +static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, + struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx) +{ + struct hvnd_ep_obj *ep_object; + int ret; + + ep_object = kzalloc(sizeof(struct hvnd_ep_obj), GFP_KERNEL); + + if (!ep_object) + return NULL; + + ret = hvnd_init_ep(ep_object, cm_id, ep_type, nd_dev, uctx); + + if (ret) { + hvnd_error("hvnd_init_ep failed ret=%d\n", ret); + kfree(ep_object); + return NULL; + } + + return ep_object; +} + +/* + * return value: + * true: I/O state is stopped; we should not do an upcall + * false: I/O state is running and normal + */ +#if 0 +static bool hvnd_manage_io_state(struct hvnd_ep_obj *ep, bool failure) +{ + unsigned long flags; + + spin_lock_irqsave(&ep->ep_lk, flags); + if (ep->to_be_destroyed) { + hvnd_warn("ep being destroyed\n"); + if (ep->io_outstanding) { + hvnd_warn("ep being destroyed I/O pending waking up on %p\n", + &ep->block_event); + complete(&ep->block_event); + ep->io_outstanding = false; + } + spin_unlock_irqrestore(&ep->ep_lk, flags); + return true; + } + if (!failure) + ep->io_outstanding = true; + spin_unlock_irqrestore(&ep->ep_lk, flags); + return false; +} +#endif + +static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, + struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx) +{ + struct hvnd_ep_obj *connector; + u64 connector_handle; + int ret; + + /* + * First handle the protocol for + * destruction - outstanding I/O. + */ + +// if (hvnd_manage_io_state(listener, false)) +// return 0; + /* + * Create a connector. + */ + connector = hvnd_setup_ep(listener->cm_id, ND_CONNECTOR, nd_dev, uctx); + if (!connector) { + hvnd_error("hvnd_setup_ep failed\n"); + ret = -ENOMEM; + goto con_alloc_err; + } + + ret = hvnd_cr_connector(nd_dev, uctx, + &connector_handle); + if (ret) { + hvnd_error("hvnd_cr_connector failed ret=%d\n", ret); + goto con_cr_err; + } + + /* + * Now get a connection if one is pending.
+ */ + connector->ep_handle = connector_handle; + connector->parent = listener; + + if (!ep_add_work_pending(connector)) + goto get_connection_err; + + ret = hvnd_get_connection_listener(nd_dev, uctx, + listener->ep_handle, + connector_handle, + (u64)connector); + + if (ret) { + hvnd_debug("listener_get_connection failed\n"); + ep_del_work_pending(connector); + goto get_connection_err; + } + + hvnd_acquire_uctx_ref(uctx); + listener->outstanding_handle = connector_handle; + listener->outstanding_ep = connector; + hvnd_debug("outstanding handle is %p\n", (void *)connector_handle); + return 0; + +get_connection_err: + hvnd_free_handle(nd_dev, uctx, + connector_handle, + IOCTL_ND_CONNECTOR_FREE); + +con_cr_err: + kfree(connector); +con_alloc_err: +// hvnd_manage_io_state(listener, true); + return ret; +} + +static int hvnd_create_listen(struct iw_cm_id *cm_id, int backlog) +{ + int ret = 0; + struct hvnd_dev *nd_dev; + struct hvnd_ucontext *uctx; + struct hvnd_ep_obj *ep_object; + union nd_sockaddr_inet addr; + union nd_sockaddr_inet local_addr; + u64 listener_handle; + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; + union nd_sockaddr_inet og_addr; + + + nd_dev = to_nd_dev(cm_id->device); + uctx = get_uctx(nd_dev, current_pid()); + hvnd_debug("uctx is %p; pid is %d\n", uctx, current_pid()); + + if (cm_id->local_addr.ss_family != AF_INET) { + hvnd_error("cm_id->local_addr.ss_family =%d not AF_INET\n", + cm_id->local_addr.ss_family); + return -ENOSYS; + } + + /* + * If the local address is LOOPBACK or INADDR_ANY, get an address + * to bind the listener. For now, just get the first address + * available. + */ + + if (IN_LOOPBACK(ntohl(laddr->sin_addr.s_addr)) || + (laddr->sin_addr.s_addr == INADDR_ANY)) { + + hvnd_debug("need to get an address\n"); + ret = hvnd_get_outgoing_rdma_addr(nd_dev, uctx, &og_addr); + + if (ret) { + hvnd_error("failed to get the og address\n"); + return ret; + } + + laddr->sin_addr.s_addr = og_addr.ipv4.sin_addr.s_addr; + } + + cm_id->add_ref(cm_id); + + ep_object = hvnd_setup_ep(cm_id, ND_LISTENER, nd_dev, uctx); + + if (!ep_object) { + hvnd_error("hvnd_setup_ep returned NULL\n"); + ret = -ENOMEM; + goto alloc_err; + } + + ret = hvnd_cr_listener(nd_dev, uctx, + &listener_handle); + if (ret) { + hvnd_error("hvnd_cr_listener failed ret=%d\n", ret); + goto cr_err; + } + + ep_object->ep_handle = listener_handle; + + cm_id->provider_data = ep_object; + + /* + * Now bind the listener. + * IPV4 support only. + */ + memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); + + ret = hvnd_bind_listener(nd_dev, uctx, + listener_handle, + &addr); + if (ret) { + hvnd_error("hvnd_bind_listener failed ret=%d\n", ret); + goto bind_err; + } + + /* + * Now get the local address. + */ + ret = hvnd_get_addr_listener(nd_dev, uctx, + listener_handle, + &local_addr); + if (ret) { + hvnd_error("hvnd_get_addr_listener failed ret=%d\n", ret); + goto bind_err; + } + + /* + * Now put the listener in the listen mode. + */ + + ret = hvnd_listen_listener(nd_dev, uctx, + listener_handle, + backlog); + + if (ret) { + hvnd_error("hvnd_listen_listener failed ret=%d\n", ret); + goto bind_err; + } + + + /* + * Now get a pending connection if one is available.
+ */ + ret = hvnd_get_incoming_connections(ep_object, nd_dev, uctx); + if (ret) { + hvnd_error("hvnd_get_incoming_connections failed ret=%d\n", + ret); + goto bind_err; + } + + hvnd_acquire_uctx_ref(uctx); + hvnd_debug("cm_id=%p\n", cm_id); + return 0; + +bind_err: + hvnd_free_handle(nd_dev, uctx, + listener_handle, + IOCTL_ND_LISTENER_FREE); +cr_err: + kfree(ep_object); +alloc_err: + cm_id->provider_data = NULL; + cm_id->rem_ref(cm_id); + return ret; +} + +static int hvnd_destroy_listen(struct iw_cm_id *cm_id) +{ + struct hvnd_dev *nd_dev; + struct hvnd_ucontext *uctx; + struct hvnd_ep_obj *ep_object; + + nd_dev = to_nd_dev(cm_id->device); + + ep_object = (struct hvnd_ep_obj *)cm_id->provider_data; + + hvnd_debug("uctx is %p\n", ep_object->uctx); + hvnd_debug("Destroying Listener cm_id=%p\n", cm_id); + uctx = ep_object->uctx; + + // make sure there is nothing in progress on this ep + ep_stop(ep_object); + + hvnd_free_handle(nd_dev, uctx, + ep_object->ep_handle, + IOCTL_ND_LISTENER_FREE); + + /* + * We may have an outstanding connector for + * incoming connection requests; clean it up. + */ + + if (ep_object->outstanding_handle != 0) { + + // make sure there is nothing in progress on this ep + ep_stop(ep_object->outstanding_ep); + + hvnd_free_handle(nd_dev, uctx, + ep_object->outstanding_handle, + IOCTL_ND_CONNECTOR_FREE); + + + hvnd_drop_uctx_ref(nd_dev, uctx); + hvnd_destroy_ep(ep_object->outstanding_ep); + } + + /* + * Now everything should have stopped. + */ + + cm_id->rem_ref(cm_id); + hvnd_destroy_ep(ep_object); + cm_id->provider_data = NULL; + hvnd_drop_uctx_ref(nd_dev, uctx); + + hvnd_debug("cm_id=%p\n", cm_id); + return 0; +} + +static void hvnd_qp_add_ref(struct ib_qp *ibqp) +{ + struct hvnd_qp *qp = to_nd_qp(ibqp); + + atomic_inc(&qp->refcnt); +} + +void hvnd_qp_rem_ref(struct ib_qp *ibqp) +{ + struct hvnd_qp *qp = to_nd_qp(ibqp); + + if (atomic_dec_and_test(&qp->refcnt)) + wake_up(&qp->wait); +} + +static DEVICE_ATTR(hw_rev, 0444, hvnd_show_rev, NULL); +static DEVICE_ATTR(fw_ver, 0444, hvnd_show_fw_ver, NULL); +static DEVICE_ATTR(hca_type, 0444, hvnd_show_hca, NULL); +static DEVICE_ATTR(board_id, 0444, hvnd_show_board, NULL); + +static struct device_attribute *hvnd_class_attributes[] = { + &dev_attr_hw_rev, + &dev_attr_fw_ver, + &dev_attr_hca_type, + &dev_attr_board_id, +}; + +int hvnd_register_device(struct hvnd_dev *dev, char *ip_addr, char *mac_addr) +{ + int ret; + + dev->ibdev.owner = THIS_MODULE; + dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY | IB_DEVICE_MEM_WINDOW; + dev->ibdev.local_dma_lkey = 0; + dev->ibdev.uverbs_cmd_mask = + (1ull << IB_USER_VERBS_CMD_GET_CONTEXT) | + (1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) | + (1ull << IB_USER_VERBS_CMD_QUERY_PORT) | + (1ull << IB_USER_VERBS_CMD_ALLOC_PD) | + (1ull << IB_USER_VERBS_CMD_DEALLOC_PD) | + (1ull << IB_USER_VERBS_CMD_REG_MR) | + (1ull << IB_USER_VERBS_CMD_DEREG_MR) | + (1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL) | + (1ull << IB_USER_VERBS_CMD_CREATE_CQ) | + (1ull << IB_USER_VERBS_CMD_DESTROY_CQ) | + (1ull << IB_USER_VERBS_CMD_REQ_NOTIFY_CQ) | + (1ull << IB_USER_VERBS_CMD_CREATE_QP) | + (1ull << IB_USER_VERBS_CMD_MODIFY_QP) | + (1ull << IB_USER_VERBS_CMD_QUERY_QP) | + (1ull << IB_USER_VERBS_CMD_POLL_CQ) | + (1ull << IB_USER_VERBS_CMD_DESTROY_QP) | + (1ull << IB_USER_VERBS_CMD_POST_SEND) | + (1ull << IB_USER_VERBS_CMD_POST_RECV); + dev->ibdev.node_type = RDMA_NODE_RNIC; + memcpy(dev->ibdev.node_desc, HVND_NODE_DESC, sizeof(HVND_NODE_DESC)); + memcpy(&dev->ibdev.node_guid, mac_addr, 6); + dev->ibdev.phys_port_cnt =
1; //dev->nports; + dev->ibdev.num_comp_vectors = 1; + dev->ibdev.query_device = hvnd_query_device; + dev->ibdev.query_port = hvnd_query_port; + dev->ibdev.get_link_layer = hvnd_get_link_layer; + dev->ibdev.query_pkey = hvnd_query_pkey; + dev->ibdev.query_gid = hvnd_query_gid; + dev->ibdev.alloc_ucontext = hvnd_alloc_ucontext; + dev->ibdev.dealloc_ucontext = hvnd_dealloc_ucontext; + dev->ibdev.mmap = hvnd_mmap; + dev->ibdev.alloc_pd = hvnd_allocate_pd; + dev->ibdev.dealloc_pd = hvnd_deallocate_pd; + dev->ibdev.create_ah = hvnd_ah_create; + dev->ibdev.destroy_ah = hvnd_ah_destroy; + dev->ibdev.create_qp = hvnd_ib_create_qp; + dev->ibdev.modify_qp = hvnd_ib_modify_qp; + dev->ibdev.query_qp = hvnd_ib_query_qp; + dev->ibdev.destroy_qp = hvnd_destroy_qp; + dev->ibdev.create_cq = hvnd_ib_create_cq; + dev->ibdev.destroy_cq = hvnd_ib_destroy_cq; + dev->ibdev.resize_cq = hvnd_resize_cq; + dev->ibdev.poll_cq = hvnd_poll_cq; + dev->ibdev.get_dma_mr = hvnd_get_dma_mr; + dev->ibdev.reg_user_mr = hvnd_reg_user_mr; + dev->ibdev.dereg_mr = hvnd_dereg_mr; + dev->ibdev.alloc_mw = hvnd_alloc_mw; + dev->ibdev.dealloc_mw = hvnd_dealloc_mw; + dev->ibdev.attach_mcast = hvnd_multicast_attach; + dev->ibdev.detach_mcast = hvnd_multicast_detach; + dev->ibdev.process_mad = hvnd_process_mad; + dev->ibdev.req_notify_cq = hvnd_arm_cq; + dev->ibdev.post_send = hvnd_post_send; + dev->ibdev.post_recv = hvnd_post_receive; + dev->ibdev.uverbs_abi_ver = MLX4_IB_UVERBS_ABI_VERSION; + + dev->ibdev.get_port_immutable = hvnd_get_port_immutable; + + //DMA ops for mapping all possible addresses + dev->ibdev.dev.parent = &(dev->hvdev->device); + dev->ibdev.dev.dma_ops = &vmbus_dma_ops; + dev->ibdev.dev.dma_mask = (u64 *) DMA_BIT_MASK(64); + dev->ibdev.dev.coherent_dma_mask = DMA_BIT_MASK(64); + + dev->ibdev.iwcm = kmalloc(sizeof(struct iw_cm_verbs), GFP_KERNEL); + if (!dev->ibdev.iwcm) + return -ENOMEM; + + dev->ibdev.iwcm->connect = hvnd_connect; + dev->ibdev.iwcm->accept = hvnd_accept_cr; + dev->ibdev.iwcm->reject = hvnd_reject_cr; + dev->ibdev.iwcm->create_listen = hvnd_create_listen; + dev->ibdev.iwcm->destroy_listen = hvnd_destroy_listen; + dev->ibdev.iwcm->add_ref = hvnd_qp_add_ref; + dev->ibdev.iwcm->rem_ref = hvnd_qp_rem_ref; + dev->ibdev.iwcm->get_qp = hvnd_get_qp; + + strlcpy(dev->ibdev.name, "mlx4_%d", IB_DEVICE_NAME_MAX); + ret = ib_register_device(&dev->ibdev, NULL); + if (ret) { + hvnd_error("ib_register_device failed ret=%d\n", ret); + goto bail1; + } + +#if 0 + for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) { + ret = device_create_file(&dev->ibdev.dev, + hvnd_class_attributes[i]); + if (ret) { + hvnd_error("device_create_file failed ret=%d\n", ret); + goto bail2; + } + } +#endif + + dev->ib_active = true; + return 0; +//bail2: +// ib_unregister_device(&dev->ibdev); +bail1: + kfree(dev->ibdev.iwcm); + return ret; +} + +void hvnd_unregister_device(struct hvnd_dev *dev) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) + device_remove_file(&dev->ibdev.dev, + hvnd_class_attributes[i]); + ib_unregister_device(&dev->ibdev); + kfree(dev->ibdev.iwcm); + ib_dealloc_device((struct ib_device *)dev); +} + +static int hvnd_try_bind_nic(unsigned char *mac, __be32 ip) +{ + int ret; + struct hvnd_dev *nd_dev = g_nd_dev; + + mutex_lock(&nd_dev->bind_mutex); + if (nd_dev->bind_complete) { + mutex_unlock(&nd_dev->bind_mutex); + return 1; + } + + memcpy(nd_dev->mac_addr, mac, 6); + *(__be32 *)(nd_dev->ip_addr) = ip; + + /* + * Bind the NIC. 
+ */ + hvnd_info("trying to bind to IP %pI4 MAC %pM\n", + nd_dev->ip_addr, nd_dev->mac_addr); + ret = hvnd_bind_nic(nd_dev, false, nd_dev->ip_addr, nd_dev->mac_addr); + if (ret || nd_dev->bind_pkt.pkt_hdr.status) { + mutex_unlock(&nd_dev->bind_mutex); + return 1; + } + + /* if we reach here, this means bind_nic is a success */ + hvnd_info("successfully bound to IP %pI4 MAC %pM\n", + nd_dev->ip_addr, nd_dev->mac_addr); + complete(&nd_dev->addr_set); + nd_dev->bind_complete = 1; + mutex_unlock(&nd_dev->bind_mutex); + + ret = hvnd_register_device(nd_dev, nd_dev->ip_addr, nd_dev->mac_addr); + + if (!ret) + return 0; + + hvnd_error("hvnd_register_device failed ret=%d\n", ret); + + /* roll back all allocated resources on error */ + iounmap(nd_dev->mmio_virt); + release_resource(&nd_dev->mmio_resource); + + vmbus_close(nd_dev->hvdev->channel); + ib_dealloc_device((struct ib_device *)nd_dev); + + return 1; +} + +static void hvnd_inetaddr_event_up(unsigned long event, struct in_ifaddr *ifa) +{ + hvnd_try_bind_nic(ifa->ifa_dev->dev->dev_addr, ifa->ifa_address); +} + +static int hvnd_inetaddr_event(struct notifier_block *notifier, + unsigned long event, void *ptr) +{ + struct in_ifaddr *ifa = ptr; + + switch (event) { + case NETDEV_UP: + hvnd_inetaddr_event_up(event, ifa); + break; + default: + hvnd_debug("Received inetaddr event %lu\n", event); + } + + return NOTIFY_DONE; +} + +static struct notifier_block hvnd_inetaddr_notifier = { + .notifier_call = hvnd_inetaddr_event, +}; + +static int start_bind_nic(void) +{ + struct net_device *dev; + struct in_device *idev; + struct in_ifaddr *ifa; + + register_inetaddr_notifier(&hvnd_inetaddr_notifier); + + rtnl_lock(); + for_each_netdev(&init_net, dev) { + idev = in_dev_get(dev); + if (!idev) + continue; + for (ifa = (idev)->ifa_list; + ifa && !(ifa->ifa_flags&IFA_F_SECONDARY); + ifa = ifa->ifa_next) { + hvnd_try_bind_nic(dev->dev_addr, ifa->ifa_address); + } + } + rtnl_unlock(); + + return 0; +} + +static int hvnd_probe(struct hv_device *dev, + const struct hv_vmbus_device_id *dev_id) +{ + struct hvnd_dev *nd_dev; + int ret = 0; + + hvnd_debug("hvnd starting\n"); + + nd_dev = (struct hvnd_dev *)ib_alloc_device(sizeof(struct hvnd_dev)); + if (!nd_dev) { + ret = -ENOMEM; + goto err_out0; + } + + nd_dev->hvdev = dev; + /* + * We are going to masquerade as MLX4 device; + * Set the vendor and device ID accordingly. + */ + dev->vendor_id = 0x15b3; //Mellanox + dev->device_id = 0x1003; //Mellanox HCA + INIT_LIST_HEAD(&nd_dev->listentry); + spin_lock_init(&nd_dev->uctxt_lk); + nd_dev->ib_active = false; + + /* + * Initialize the state for the id table. + */ + spin_lock_init(&nd_dev->id_lock); + idr_init(&nd_dev->cqidr); + idr_init(&nd_dev->qpidr); + idr_init(&nd_dev->mmidr); + idr_init(&nd_dev->irpidr); + idr_init(&nd_dev->uctxidr); + + atomic_set(&nd_dev->open_cnt, 0); + + sema_init(&nd_dev->query_pkt_sem, 1); + + ret = vmbus_open(dev->channel, HVND_RING_SZ, HVND_RING_SZ, NULL, 0, + hvnd_callback, dev); + + if (ret) { + hvnd_error("vmbus_open failed ret=%d\n", ret); + goto err_out1; + } + + hv_set_drvdata(dev, nd_dev); + + ret = hvnd_negotiate_version(nd_dev); + + if (ret) { + hvnd_error("hvnd_negotiate_version failed ret=%d\n", ret); + goto err_out2; + } + + /* + * Register resources with the host.
+ */ + ret = hvnd_init_resources(nd_dev); + if (ret) { + hvnd_error("hvnd_init_resources failed ret=%d\n", ret); + goto err_out2; + } + + /* + * Try to bind every NIC to ND channel, + * ND host will only return success for the correct one + */ + nd_dev->bind_complete = 0; + mutex_init(&nd_dev->bind_mutex); + init_completion(&nd_dev->addr_set); + + g_nd_dev = nd_dev; + start_bind_nic(); + + return 0; + +err_out2: + vmbus_close(dev->channel); + +err_out1: + ib_dealloc_device((struct ib_device *)nd_dev); + +err_out0: + return ret; +} + +static int hvnd_remove(struct hv_device *dev) +{ + struct hvnd_dev *nd_dev = hv_get_drvdata(dev); + + unregister_inetaddr_notifier(&hvnd_inetaddr_notifier); + hvnd_bind_nic(nd_dev, true, nd_dev->ip_addr, nd_dev->mac_addr); + hvnd_unregister_device(nd_dev); + vmbus_close(dev->channel); + iounmap(nd_dev->mmio_virt); + release_resource(&nd_dev->mmio_resource); + return 0; +} + +static const struct hv_vmbus_device_id id_table[] = { + /* VMBUS RDMA class guid */ + /* 8c2eaf3d-32a7-4b09-ab99-bd1f1c86b501 */ + { HV_ND_GUID, }, + { }, +}; + +MODULE_DEVICE_TABLE(vmbus, id_table); + +static struct hv_driver hvnd_drv = { + .name = "hv_guest_rdma", + .id_table = id_table, + .probe = hvnd_probe, + .remove = hvnd_remove, +}; + + +static int __init init_hvnd_drv(void) +{ + + pr_info("Registered HyperV networkDirect Driver\n"); + hvnd_addr_init(); + return vmbus_driver_register(&hvnd_drv); +} + +static void exit_hvnd_drv(void) +{ + pr_info("De-Registered HyperV networkDirect Driver\n"); + hvnd_addr_deinit(); + vmbus_driver_unregister(&hvnd_drv); +} + + +module_init(init_hvnd_drv); +module_exit(exit_hvnd_drv); + +MODULE_DESCRIPTION("Hyper-V NetworkDirect Driver"); +MODULE_LICENSE("GPL"); +MODULE_VERSION("4.1.3"); diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/user.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/user.h new file mode 100644 index 000000000000..07e6769ef43b --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/user.h @@ -0,0 +1,107 @@ +/* + * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved. + * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ */ + +#ifndef MLX4_IB_USER_H +#define MLX4_IB_USER_H + +#include <linux/types.h> + +/* + * Increment this value if any changes that break userspace ABI + * compatibility are made. + */ + +#define MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION 3 +#define MLX4_IB_UVERBS_ABI_VERSION 4 + +/* + * Make sure that all structs defined in this file remain laid out so + * that they pack the same way on 32-bit and 64-bit architectures (to + * avoid incompatibility between 32-bit userspace and 64-bit kernels). + * In particular do not use pointer types -- pass pointers in __u64 + * instead. + */ + +struct mlx4_ib_alloc_ucontext_resp_v3 { + __u32 qp_tab_size; + __u16 bf_reg_size; + __u16 bf_regs_per_page; +}; + +struct mlx4_ib_alloc_ucontext_resp { + __u32 dev_caps; + __u32 qp_tab_size; + __u16 bf_reg_size; + __u16 bf_regs_per_page; + __u32 cqe_size; +}; + +struct mlx4_ib_alloc_pd_resp { + __u32 pdn; + __u32 reserved; +}; + +struct mlx4_ib_create_cq { + __u64 buf_addr; + __u64 db_addr; +}; + +struct mlx4_ib_create_cq_resp { + __u32 cqn; + __u32 reserved; +}; + +struct mlx4_ib_resize_cq { + __u64 buf_addr; +}; + +struct mlx4_ib_create_srq { + __u64 buf_addr; + __u64 db_addr; +}; + +struct mlx4_ib_create_srq_resp { + __u32 srqn; + __u32 reserved; +}; + +struct mlx4_ib_create_qp { + __u64 buf_addr; + __u64 db_addr; + __u8 log_sq_bb_count; + __u8 log_sq_stride; + __u8 sq_no_prefetch; + __u8 reserved[5]; +}; + +#endif /* MLX4_IB_USER_H */ diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.c new file mode 100644 index 000000000000..daf68dcccb1b --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.c @@ -0,0 +1,3074 @@ +/* + * Copyright (c) 2014, Microsoft Corporation. + * + * Author: + * K. Y. Srinivasan <kys@microsoft.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. + * + * Bug fixes/enhancements: Long Li <longli@microsoft.com> + */ + +#include <linux/completion.h> +#include <linux/module.h> +#include <linux/errno.h> +#include <linux/hyperv.h> +#include <linux/efi.h> +#include <linux/slab.h> +#include <linux/cred.h> +#include <linux/sched.h> +#include <linux/types.h> +#include <linux/scatterlist.h> +#include <rdma/ib_umem.h> +#include <rdma/ib_user_verbs.h> +#include <asm-generic/delay.h> + +#include "vmbus_rdma.h" + +/* + * We only have a single rdma device on the host; + * have a single receive buffer. 
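+ *
+ * Every request in this file follows the same send-and-wait shape:
+ * the caller wraps its packet pointer in a struct hvnd_cookie (from
+ * vmbus_rdma.h), passes the cookie's address as the vmbus request
+ * id, and sleeps on the embedded completion until the host's reply
+ * lands in hvnd_recv_buffer below. A condensed sketch of the
+ * pattern as used throughout (error handling elided):
+ *
+ *	struct hvnd_cookie cookie = { .pkt = pkt };
+ *
+ *	init_completion(&cookie.host_event);
+ *	vmbus_sendpacket(nd_dev->hvdev->channel, pkt, pkt_size,
+ *			 (u64)&cookie, VM_PKT_DATA_INBAND,
+ *			 VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
+ *	wait_for_completion_timeout(&cookie.host_event, 500 * HZ);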
+ */ + + +static char hvnd_recv_buffer[PAGE_SIZE * 4]; + +static atomic_t irp_local_hdl; + +char *hvnd_get_op_name(int ioctl) +{ + switch (ioctl) { + case IOCTL_ND_PROVIDER_INIT: + return "IOCTL_ND_PROVIDER_INIT\n"; + case IOCTL_ND_PROVIDER_BIND_FILE: + return "IOCTL_ND_PROVIDER_BIND_FILE\n"; + case IOCTL_ND_ADAPTER_OPEN: + return "IOCTL_ND_ADAPTER_OPEN\n"; + + case IOCTL_ND_ADAPTER_CLOSE: + return "IOCTL_ND_ADAPTER_CLOSE\n"; + + case IOCTL_ND_ADAPTER_QUERY: + return "IOCTL_ND_ADAPTER_QUERY\n"; + + case IOCTL_ND_PD_CREATE: + return "IOCTL_ND_PD_CREATE\n"; + + case IOCTL_ND_PD_FREE: + return "IOCTL_ND_PD_FREE\n"; + + case IOCTL_ND_CQ_CREATE: + return "IOCTL_ND_CQ_CREATE\n"; + + case IOCTL_ND_CQ_FREE: + return "IOCTL_ND_CQ_FREE\n"; + case IOCTL_ND_CQ_CANCEL_IO: + return "IOCTL_ND_CQ_CANCEL_IO\n"; + case IOCTL_ND_CQ_GET_AFFINITY: + return "IOCTL_ND_CQ_GET_AAFINITY\n"; + case IOCTL_ND_CQ_MODIFY: + return "IOCTL_ND_CQ_MODIFY\n"; + + case IOCTL_ND_CQ_NOTIFY: + return "IOCTL_ND_CQ_NOTIFY\n"; + + + case IOCTL_ND_LISTENER_CREATE: + return "IOCTL_ND_LISTENER_CREATE\n"; + + case IOCTL_ND_LISTENER_FREE: + return "IOCTL_ND_LISTENER_FREE\n"; + + case IOCTL_ND_QP_FREE: + return "IOCTL_ND_QP_FREE\n"; + + case IOCTL_ND_CONNECTOR_CANCEL_IO: + return "IOCTL_ND_CONNECTOR_CANCEL_IO\n"; + + case IOCTL_ND_LISTENER_CANCEL_IO: + return "IOCTL_ND_LISTENER_CANCEL_IO\n"; + + case IOCTL_ND_LISTENER_BIND: + return "IOCTL_ND_LISTENER_BIND\n"; + + case IOCTL_ND_LISTENER_LISTEN: + return "IOCTL_ND_LISTENER_LISTEN\n"; + + case IOCTL_ND_LISTENER_GET_ADDRESS: + return "IOCTL_ND_LISTENER_GET_ADDRESS\n"; + + case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: + return "IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST\n"; + + + + case IOCTL_ND_CONNECTOR_CREATE: + return "IOCTL_ND_CONNECTOR_CREATE\n"; + + case IOCTL_ND_CONNECTOR_FREE: + return "IOCTL_ND_CONNECTOR_FREE\n"; + + case IOCTL_ND_CONNECTOR_BIND: + return "IOCTL_ND_CONNECTOR_BIND\n"; + + // KYS: ALERT: ASYNCH Operation + case IOCTL_ND_CONNECTOR_CONNECT: + return "IOCTL_ND_CONNECTOR_CONNECT\n"; + + case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: + return "IOCTL_ND_CONNECTOR_COMPLETE_CONNECT\n"; + + // KYS: ALERT: ASYNCH Operation + case IOCTL_ND_CONNECTOR_ACCEPT: + return "IOCTL_ND_CONNECTOR_ACCEPT\n"; + + case IOCTL_ND_CONNECTOR_REJECT: + return "IOCTL_ND_CONNECTOR_REJECT\n"; + + case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: + return "IOCTL_ND_CONNECTOR_GET_READ_LIMITS\n"; + + case IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: + return "IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA\n"; + + case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: + return "IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS\n"; + + case IOCTL_ND_CONNECTOR_GET_ADDRESS: + return "IOCTL_ND_CONNECTOR_GET_ADDRESS\n"; + + // KYS: ALERT: ASYNCH Operation + case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: + return "IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT\n"; + + // KYS: ALERT: ASYNCH Operation + case IOCTL_ND_CONNECTOR_DISCONNECT: + return "IOCTL_ND_CONNECTOR_DISCONNECT\n"; + + + + case IOCTL_ND_QP_CREATE: + return "IOCTL_ND_QP_CREATE\n"; + + case IOCTL_ND_MR_CREATE: + return "IOCTL_ND_MR_CREATE\n"; + + case IOCTL_ND_MR_FREE: + return "IOCTL_ND_MR_FREE\n"; + case IOCTL_ND_MR_REGISTER: + return "IOCTL_ND_MR_REGISTER\n"; + case IOCTL_ND_MR_DEREGISTER: + return "IOCTL_ND_MR_DEREGISTER\n"; + case IOCTL_ND_MR_CANCEL_IO: + return "IOCTL_ND_MR_CANCEL_IO\n"; + case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: + return "IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST\n"; + case IOCTL_ND_QP_FLUSH: + return "IOCTL_ND_QP_FLUSH\n"; + + default: + return "Unknown IOCTL\n"; + } +} +int get_irp_handle(struct 
hvnd_dev *nd_dev, u32 *local, void *irp_ctx) +{ + unsigned int local_handle; + int ret; + + local_handle = atomic_inc_return(&irp_local_hdl); + *local = local_handle; + + /* + * Now asssociate the local handle with the pointer. + */ + ret = insert_handle(nd_dev, &nd_dev->irpidr, irp_ctx, local_handle); + hvnd_debug("irp_ctx=%p local_handle=%u\n", irp_ctx, local_handle); + + if (ret) { + hvnd_error("insert_handle failed ret=%d\n", ret); + return ret; + } + + return 0; +} + +void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp) +{ + remove_handle(nd_dev, &nd_dev->irpidr, irp); + +} + +static void init_pfn(u64 *pfn, void *addr, u32 length) +{ + int i; + u32 offset = offset_in_page(addr); + u32 num_pfn = DIV_ROUND_UP(offset + length, PAGE_SIZE); + + for (i = 0; i < num_pfn; i++) + pfn[i] = virt_to_phys((u8 *)addr + + (PAGE_SIZE * i)) >> PAGE_SHIFT; +} + + +static void user_va_init_pfn(u64 *pfn, struct ib_umem *umem) +{ + int entry; + struct scatterlist *sg; + int i = 0; + + for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) { + pfn[i++] = page_to_pfn(sg_page(sg)); + } +} + +static u32 get_local_handle(void *p) +{ + u64 val = (unsigned long)p; + + return (u32)val; +} + +static int hvnd_send_pg_buffer(struct hvnd_dev *nd_dev, + struct vmbus_packet_mpb_array *desc, + u32 desc_size, + void *buffer, + u32 bufferlen, u64 cookie) +{ + int ret; + int t; + struct hvnd_cookie hvnd_cookie; + + hvnd_cookie.pkt = (void *)cookie; + init_completion(&hvnd_cookie.host_event); + + ret = vmbus_sendpacket_mpb_desc(nd_dev->hvdev->channel, + desc, + desc_size, + buffer, bufferlen, + (u64)(&hvnd_cookie)); + + if (ret) { + hvnd_error("vmbus_sendpacket_mpb_desc failed ret=%d\n", ret); + goto err; + } + + t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); + + if (t == 0) { + hvnd_error("wait_for_completion_timeout timed out\n"); + ret = -ETIMEDOUT; + } + +err: + return ret; +} + +static int hvnd_send_packet(struct hvnd_dev *nd_dev, void *buffer, + u32 bufferlen, u64 cookie, bool block) +{ + int ret; + int t; + struct hvnd_cookie hvnd_cookie; + + hvnd_cookie.pkt = (void *)cookie; + init_completion(&hvnd_cookie.host_event); + + ret = vmbus_sendpacket(nd_dev->hvdev->channel, buffer, bufferlen, + (u64)(&hvnd_cookie), VM_PKT_DATA_INBAND, + VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); + + if (ret) { + hvnd_error("vmbus_send pkt failed: %d\n", ret); + goto err; + } + + if (!block) + return ret; + + t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); + + if (t == 0) { + hvnd_error("wait_for_completion_timeout timed out\n"); + ret = -ETIMEDOUT; + } + +err: + return ret; +} + +static int hvnd_send_pgbuf_ioctl_pkt(struct hvnd_dev *nd_dev, + struct vmbus_packet_mpb_array *desc, + u32 desc_size, + struct ndv_packet_hdr_control_1 *hdr, + u32 pkt_size, u64 cookie) +{ + int ret; + int ioctl; + + ioctl = hdr->io_cntrl_code; + + + ret = hvnd_send_pg_buffer(nd_dev, desc, desc_size, + hdr, pkt_size, cookie); + + if (ret) + return ret; + + if (hdr->pkt_hdr.status != 0) { + hvnd_error("IOCTL: %s failed; status is %x\n", + hvnd_get_op_name(ioctl), + hdr->pkt_hdr.status); + return -EINVAL; + } + + switch (hdr->io_status) { + case STATUS_SUCCESS: + case STATUS_PENDING: + return 0; + + default: + hvnd_error("IOCTL: %s failed io status is %x\n", + hvnd_get_op_name(ioctl), hdr->io_status); + return -EINVAL; + } +} + +int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev, + struct ndv_packet_hdr_control_1 *hdr, + u32 pkt_size, u64 cookie) +{ + int ret; + int ioctl; + bool block; + + block = (hdr->irp_handle.val64 == 0) ? 
true : false; + + + ioctl = hdr->io_cntrl_code; + + ret = hvnd_send_packet(nd_dev, hdr, pkt_size, cookie, block); + + if (ret) + return ret; + + if (!block) + return ret; + + if (hdr->pkt_hdr.status != 0) { + hvnd_error("IOCTL: %s failed; status is %x\n", + hvnd_get_op_name(ioctl), hdr->pkt_hdr.status); + return -EINVAL; + } + + switch (hdr->io_status) { + case STATUS_SUCCESS: + case STATUS_PENDING: + return 0; + + default: + hvnd_warn("IOCTL: %s failed io status is %x\n", + hvnd_get_op_name(ioctl), hdr->io_status); + return -EINVAL; + } +} + +void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr, + u32 data_sz, u32 local, u32 remote, + u32 ioctl_code, + u32 ext_data_sz, u32 ext_data_offset, + u64 irp_handle) + +{ + int pkt_type; + + pkt_type = NDV_PKT_ID1_CONTROL; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); + hdr->pkt_hdr.packet_type = pkt_type; + hdr->pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); + hdr->pkt_hdr.data_sz = data_sz; + + hdr->pkt_hdr.status = 0; + + hdr->file_handle.local = local; + hdr->file_handle.remote = remote; + hdr->irp_handle.val64 = irp_handle; + + hdr->io_cntrl_code = ioctl_code; + hdr->output_buf_sz = data_sz - ext_data_sz; + hdr->input_buf_sz = data_sz - ext_data_sz; + + hdr->input_output_buf_offset = 0; + + hdr->extended_data.size = ext_data_sz; + hdr->extended_data.offset = ext_data_offset; +} + + +int hvnd_create_file(struct hvnd_dev *nd_dev, void *uctx, + struct ndv_pkt_hdr_create_1 *create, u32 file_flags) +{ + int ret; + int pkt_type; + + + pkt_type = NDV_PKT_ID1_CREATE; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); + create->pkt_hdr.packet_type = pkt_type; + create->pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); + create->pkt_hdr.data_sz = 0; + + create->handle.local = get_local_handle(uctx); + create->access_mask = STANDARD_RIGHTS_ALL; + create->open_options = OPEN_EXISTING; + create->file_attributes = FILE_ATTRIBUTE_NORMAL | file_flags; + create->share_access = FILE_SHARE_ALL; + + ret = hvnd_send_packet(nd_dev, create, + sizeof(struct ndv_pkt_hdr_create_1), + (unsigned long)create, true); + return ret; +} + +int hvnd_cleanup_file(struct hvnd_dev *nd_dev, u32 local, u32 remote) +{ + int ret; + int pkt_type; + struct ndv_pkt_hdr_cleanup_1 cleanup_pkt; + + + pkt_type = NDV_PKT_ID1_CLEANUP; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); + + cleanup_pkt.pkt_hdr.packet_type = pkt_type; + cleanup_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); + cleanup_pkt.pkt_hdr.data_sz = 0; + + cleanup_pkt.handle.local = local; + cleanup_pkt.handle.remote = remote; + + ret = hvnd_send_packet(nd_dev, &cleanup_pkt, + sizeof(struct ndv_pkt_hdr_create_1), + (unsigned long)&cleanup_pkt, true); + return ret; +} + + +static int hvnd_do_ioctl(struct hvnd_dev *nd_dev, u32 ioctl, + struct pkt_nd_provider_ioctl *pkt, + union ndv_context_handle *hdr_handle, + struct nd_handle *ioctl_handle, + u8 *buf, u32 buf_len, bool c_in, bool c_out, u64 irp_val) +{ + int ret; + int pkt_type; + + pkt_type = NDV_PKT_ID1_CONTROL; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); + + pkt->hdr.pkt_hdr.packet_type = pkt_type; + pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); + pkt->hdr.pkt_hdr.data_sz = (sizeof(struct pkt_nd_provider_ioctl) - + sizeof(struct ndv_packet_hdr_control_1)); + + pkt->hdr.file_handle.local = hdr_handle->local; + pkt->hdr.file_handle.remote = hdr_handle->remote; + hvnd_debug("create handle local: %x remote: %x\n", + 
hdr_handle->local, hdr_handle->remote); + + pkt->hdr.irp_handle.val64 = irp_val; + + pkt->hdr.io_cntrl_code = ioctl; + pkt->hdr.output_buf_sz = sizeof(struct nd_ioctl); + pkt->hdr.input_buf_sz = sizeof(struct nd_ioctl); + pkt->hdr.input_output_buf_offset = 0; + memset(&pkt->ioctl.handle, 0, sizeof(struct nd_handle)); + pkt->ioctl.handle.version = ND_VERSION_1; + + switch (ioctl) { + case IOCTL_ND_PROVIDER_BIND_FILE: + pkt->ioctl.handle.handle = ioctl_handle->handle; + break; + default: + break; + }; + + /* + * Copy the input buffer, if needed. + */ + + if (c_in && (buf != NULL)) + memcpy(pkt->ioctl.raw_buffer, buf, buf_len); + + ret = hvnd_send_packet(nd_dev, pkt, + sizeof(struct pkt_nd_provider_ioctl), + (unsigned long)pkt, true); + + if (ret) + return ret; + + if (c_out && (buf != NULL)) + memcpy(buf, pkt->ioctl.raw_buffer, buf_len); + + return ret; +} + +static int idr_callback(int id, void *p, void *data) +{ + if (p == data) + return id; + return 0; +} + +void remove_uctx(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) +{ + int pid = current_pid(); + unsigned long flags; + int id; + + if (get_uctx(nd_dev, pid) == uctx) + remove_handle(nd_dev, &nd_dev->uctxidr, pid); + else { + hvnd_warn("uctx %p not found on pid %d, doing a idr search\n", + uctx, current_pid()); + + spin_lock_irqsave(&nd_dev->id_lock, flags); + id = idr_for_each(&nd_dev->uctxidr, idr_callback, uctx); + spin_unlock_irqrestore(&nd_dev->id_lock, flags); + + if (id) + remove_handle(nd_dev, &nd_dev->uctxidr, id); + else { + hvnd_error("uctx %p not found in idr table\n", uctx); + return; + } + } + + kfree(uctx); +} + +int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) +{ + int ret; + + /* + * First close the adaptor. + */ + + ret = hvnd_free_handle(nd_dev, uctx, + uctx->adaptor_hdl, + IOCTL_ND_ADAPTER_CLOSE); + + if (ret) + hvnd_error("Adaptor close failed; ret is %x\n", ret); + + /* + * Now close the two files we created. + */ + + ret = hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, + uctx->file_handle_ovl.remote); + + if (ret) + hvnd_error("file cleanup failed; ret is %x\n", ret); + + ret = hvnd_cleanup_file(nd_dev, uctx->file_handle.local, + uctx->file_handle.remote); + + if (ret) + hvnd_error("File cleanup failed; ret is %x\n", ret); + + /* + * Remove the uctx from the ID table. + */ + remove_uctx(nd_dev, uctx); + + return 0; +} + +int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) +{ + int ret; + struct pkt_nd_provider_ioctl *pr_init = &uctx->pr_init_pkt; + int pkt_type; + struct nd_handle ioctl_handle; + struct pkt_nd_open_adapter *pr_o_adap = &uctx->o_adap_pkt; + + ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt, 0); + if (ret) { + hvnd_error("hvnd_create_file failed ret=%d\n", ret); + goto error_cr; + } + + if (uctx->create_pkt.pkt_hdr.status != 0) { + hvnd_error("create File failed; status is %d\n", + uctx->create_pkt.pkt_hdr.status); + ret = -EINVAL; + goto error_cr; + } + + uctx->file_handle.local = uctx->create_pkt.handle.local; + uctx->file_handle.remote = uctx->create_pkt.handle.remote; + hvnd_debug("INITIALIZE PROVIDER\n"); + /* + * Now Initialize the Provider. 
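+ *
+ * The provider bring-up below is a four-step handshake with the
+ * host, mirroring what an ND provider does on Windows:
+ *
+ *	1. create a file handle			(hvnd_create_file)
+ *	2. IOCTL_ND_PROVIDER_INIT on that handle
+ *	3. create a second file handle with FILE_FLAG_OVERLAPPED,
+ *	   used for asynchronous (IRP-based) requests
+ *	4. IOCTL_ND_PROVIDER_BIND_FILE to tie the two together
+ *
+ * Only after these succeed is IOCTL_ND_ADAPTER_OPEN attempted.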
+ */ + ioctl_handle.handle = 0; + ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_INIT, pr_init, + &uctx->create_pkt.handle, + &ioctl_handle, NULL, 0, false, false, 0); + + if (ret) { + ret = -EINVAL; + goto error_pr_init; + } + + if (pr_init->hdr.pkt_hdr.status != 0) { + hvnd_error("Provider INIT failed; status is %d\n", + pr_init->hdr.pkt_hdr.status); + ret = -EINVAL; + goto error_pr_init; + } + + if (pr_init->hdr.io_status != 0) { + hvnd_error("Provider INIT failed; io status is %d\n", + pr_init->hdr.io_status); + ret = -EINVAL; + goto error_pr_init; + } + + /* + * Now create the overlap file. + */ + + hvnd_debug("CREATE OVERLAP FILE\n"); + ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt_ovl, + FILE_FLAG_OVERLAPPED); + if (ret) { + hvnd_error("hvnd_create_file failed ret=%d\n", ret); + goto error_pr_init; + } + + if (uctx->create_pkt_ovl.pkt_hdr.status != 0) { + hvnd_error("create Overlap File failed; status is %d\n", + uctx->create_pkt_ovl.pkt_hdr.status); + ret = -EINVAL; + goto error_pr_init; + } + uctx->file_handle_ovl.local = uctx->create_pkt_ovl.handle.local; + uctx->file_handle_ovl.remote = uctx->create_pkt_ovl.handle.remote; + + /* + * Now bind the two file handles together. + */ + + hvnd_debug("BIND FILE IOCTL remote handle: %d local handle: %d\n", + uctx->create_pkt_ovl.handle.remote, + uctx->create_pkt_ovl.handle.local); + + ioctl_handle.handle = uctx->create_pkt_ovl.handle.val64; + ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_BIND_FILE, pr_init, + &uctx->create_pkt.handle, + &ioctl_handle, NULL, 0, false, false, 0); + + if (ret) { + ret = -EINVAL; + goto error_file_bind; + } + if (pr_init->hdr.pkt_hdr.status != 0) { + hvnd_error("Provider File bind failed; status is %d\n", + pr_init->hdr.pkt_hdr.status); + ret = -EINVAL; + goto error_file_bind; + } + if (pr_init->hdr.io_status != 0) { + hvnd_error("Provider INIT failed; io status is %d\n", + pr_init->hdr.io_status); + ret = -EINVAL; + goto error_file_bind; + } + + /* + * Now open the adaptor. 
+ */ + + hvnd_debug("OPENING THE ADAPTOR\n"); + + pkt_type = NDV_PKT_ID1_CONTROL; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); + pr_o_adap->hdr.pkt_hdr.packet_type = pkt_type; + pr_o_adap->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); + pr_o_adap->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_open_adapter) - + sizeof(struct ndv_packet_hdr_control_1); + + pr_o_adap->hdr.pkt_hdr.status = 0; + + hvnd_debug("hdr sz is %d\n", pr_o_adap->hdr.pkt_hdr.hdr_sz); + hvnd_debug("data sz is %d\n", pr_o_adap->hdr.pkt_hdr.data_sz); + + pr_o_adap->hdr.file_handle.local = uctx->create_pkt.handle.local; + pr_o_adap->hdr.file_handle.remote = uctx->create_pkt.handle.remote; + hvnd_debug("create handle local is %x\n", + uctx->create_pkt.handle.local); + hvnd_debug("create handle remote is %x\n", + uctx->create_pkt.handle.remote); + pr_o_adap->hdr.irp_handle.val64 = 0; + + pr_o_adap->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_OPEN; + pr_o_adap->hdr.output_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - + sizeof(struct extended_data_oad); + pr_o_adap->hdr.input_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - + sizeof(struct extended_data_oad); + + hvnd_debug("output buf sz is %d\n", pr_o_adap->hdr.output_buf_sz); + hvnd_debug("input buf sz is %d\n", pr_o_adap->hdr.input_buf_sz); + hvnd_debug("packet size is %d\n", + (int)sizeof(struct pkt_nd_open_adapter)); + + pr_o_adap->hdr.input_output_buf_offset = 0; + + + pr_o_adap->hdr.extended_data.size = sizeof(struct extended_data_oad); + pr_o_adap->hdr.extended_data.offset = + offsetof(struct pkt_nd_open_adapter, ext_data) - + sizeof(struct ndv_packet_hdr_control_1); + + hvnd_debug("size of the extended data size: %d\n", + (int)sizeof(struct extended_data_oad)); + hvnd_debug("offset of extended data: %d\n", + pr_o_adap->hdr.extended_data.offset); + + /* + * Now fill out the ioctl section. + */ + + pr_o_adap->ioctl.input.version = ND_VERSION_1; + pr_o_adap->ioctl.input.ce_mapping_cnt = + RTL_NUMBER_OF(pr_o_adap->mappings.ctx_input.mappings); + + hvnd_debug("ce_mapping cnt is %d\n", + pr_o_adap->ioctl.input.ce_mapping_cnt); + + pr_o_adap->ioctl.input.cb_mapping_offset = sizeof(union oad_ioctl); + hvnd_debug("cb_mapping offset is %d\n", + pr_o_adap->ioctl.input.cb_mapping_offset); + pr_o_adap->ioctl.input.adapter_id = (u64)nd_dev; + + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_type = ND_MAP_IOSPACE; + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cache_type = ND_NON_CACHED; + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cb_length = 4096; + + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_type = ND_MAP_IOSPACE; + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cache_type = ND_WRITE_COMBINED; + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cb_length = 4096; + + /* + * Fill in the extended data. 
+ */ + pr_o_adap->ext_data.cnt = IBV_GET_CONTEXT_MAPPING_MAX; + + ret = hvnd_send_packet(nd_dev, pr_o_adap, + sizeof(struct pkt_nd_open_adapter), + (unsigned long)pr_o_adap, true); + if (ret) { + ret = -EINVAL; + goto error_file_bind; + } + + if (pr_o_adap->hdr.pkt_hdr.status != 0) { + hvnd_error("Open adaptor failed; status is %d\n", + pr_o_adap->hdr.pkt_hdr.status); + ret = -EINVAL; + goto error_file_bind; + } + + if (pr_o_adap->hdr.io_status != 0) { + hvnd_error("Open adaptor failed;io status is %d\n", + pr_o_adap->hdr.io_status); + ret = -EINVAL; + goto error_file_bind; + } + + /* + * Copy the necessary response from the host. + */ + + uctx->adaptor_hdl = pr_o_adap->ioctl.resrc_desc.handle; + + + hvnd_debug("adaptor handle: %p\n", (void *)uctx->adaptor_hdl); + + uctx->uar_base = + pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_UAR].info; + hvnd_debug("uar base: %p\n", (void *)uctx->uar_base); + + uctx->bf_base = + pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_BF].info; + hvnd_debug("bf base: %p\n", (void *)uctx->bf_base); + + uctx->bf_buf_size = + pr_o_adap->mappings.ctx_output.bf_buf_size; + hvnd_debug("bf buf size: %d\n", uctx->bf_buf_size); + + uctx->bf_offset = + pr_o_adap->mappings.ctx_output.bf_offset; + hvnd_debug("bf offset: %d\n", uctx->bf_offset); + + uctx->cqe_size = + pr_o_adap->mappings.ctx_output.cqe_size; + hvnd_debug("cqe size: %d\n", uctx->cqe_size); + + uctx->max_qp_wr = + pr_o_adap->mappings.ctx_output.max_qp_wr; + hvnd_debug("max qp wr: %d\n", uctx->max_qp_wr); + + uctx->max_sge = + pr_o_adap->mappings.ctx_output.max_sge; + hvnd_debug("max sge: %d\n", uctx->max_sge); + + uctx->max_cqe = + pr_o_adap->mappings.ctx_output.max_cqe; + hvnd_debug("max cqe: %d\n", uctx->max_cqe); + + uctx->num_qps = + pr_o_adap->mappings.ctx_output.qp_tab_size; + hvnd_debug("num qps: %d\n", uctx->num_qps); + + /* + * Now query the adaptor and stash away the adaptor info. + */ + + ret = hvnd_query_adaptor(nd_dev, uctx); + if (ret) { + hvnd_error("Query Adaptor failed; ret is %d\n", ret); + goto query_err; + } + + return ret; + +query_err: + hvnd_free_handle(nd_dev, uctx, + uctx->adaptor_hdl, + IOCTL_ND_ADAPTER_CLOSE); + + hvnd_error("Open Adaptor Failed!!\n"); + +error_file_bind: + hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, + uctx->file_handle_ovl.remote); + +error_pr_init: + hvnd_cleanup_file(nd_dev, uctx->file_handle.local, + uctx->file_handle.remote); + +error_cr: + if (get_uctx(nd_dev, current_pid()) != NULL) + remove_handle(nd_dev, &nd_dev->uctxidr, current_pid()); + + return ret; +} + +int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + struct hvnd_cq *cq) +{ + int ret; + struct pkt_nd_create_cq *pkt; + int num_pfn, num_db_pfn, num_sn_pfn; + int cq_pkt_size; + unsigned int cq_buf_size, offset; + u32 ext_data_sz; + u32 ext_data_offset; + + /* + * Now create CQ. + * First compute the number of PFNs we need to accomodate: + * One each for door bell and arm_sn and pages in cq buffer. 
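+ *
+ * A worked example with illustrative numbers: 128 entries of
+ * 64-byte CQEs make cq_buf_size 8192; if the buffer begins 2048
+ * bytes into a page, DIV_ROUND_UP(2048 + 8192, 4096) = 3 PFNs are
+ * needed for the CQ buffer alone. The 8-byte doorbell and 4-byte
+ * arm_sn each take a single PFN unless they straddle a page
+ * boundary.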
+ */ + cq_buf_size = (cq->entries * uctx->cqe_size); + offset = offset_in_page(cq->cq_buf); + num_pfn = DIV_ROUND_UP(offset + cq_buf_size, PAGE_SIZE); + + offset = offset_in_page(cq->db_addr); + num_db_pfn = DIV_ROUND_UP(offset + 8, PAGE_SIZE); + + offset = offset_in_page(&cq->arm_sn); + num_sn_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); + + cq_pkt_size = sizeof(struct pkt_nd_create_cq) + + (num_pfn * sizeof(u64)); + + ext_data_sz = sizeof(struct create_cq_ext_data) + + (num_pfn * sizeof(u64)); + ext_data_offset = offsetof(struct pkt_nd_create_cq, ext_data) - + sizeof(struct ndv_packet_hdr_control_1); + + hvnd_debug("CREATE CQ, num user addr pfns is %d\n", num_pfn); + hvnd_debug("CREATE CQ, num db pfns is %d\n", num_db_pfn); + + pkt = kzalloc(cq_pkt_size, GFP_KERNEL); + + if (!pkt) + return -ENOMEM; + + hvnd_init_hdr(&pkt->hdr, + (cq_pkt_size - + sizeof(struct ndv_packet_hdr_control_1)), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CQ_CREATE, + ext_data_sz, + ext_data_offset, + 0); + + /* + * Now fill out the ioctl section. + */ + + pkt->ioctl.input.version = ND_VERSION_1; + pkt->ioctl.input.queue_depth = cq->entries; + pkt->ioctl.input.ce_mapping_cnt = MLX4_IB_CREATE_CQ_MAPPING_MAX; + pkt->ioctl.input.cb_mapping_offset = sizeof(union create_cq_ioctl); + + hvnd_debug("ce_mapping cnt is %d\n", pkt->ioctl.input.ce_mapping_cnt); + hvnd_debug("cb_mapping offset is %d\n", + pkt->ioctl.input.cb_mapping_offset); + + pkt->ioctl.input.adapter_handle = uctx->adaptor_hdl; + pkt->ioctl.input.affinity.mask = 0; + pkt->ioctl.input.affinity.group = -1; + + // 0 for usermode CQ arming + pkt->mappings.cq_in.flags = 0; + + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.map_type = ND_MAP_MEMORY; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.access_type = ND_MODIFY_ACCESS; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.address = (u64)cq->cq_buf; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.cb_length = (cq->entries * uctx->cqe_size); + + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.access_type = ND_WRITE_ACCESS; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.address = (u64)cq->db_addr; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.cb_length = 8; //size of two ints + + + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.map_type = ND_MAP_MEMORY; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.access_type = ND_MODIFY_ACCESS; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.address = (u64)&cq->arm_sn; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.cb_length = 4; //size of one int + /* + * Fill in the extended data. + */ + + pkt->ext_data.cnt = 3; + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].size = (sizeof(struct gpa_range) + (num_pfn * sizeof(u64))); + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].offset = offsetof(struct create_cq_ext_data, cqbuf_gpa); + + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].size = sizeof(struct cq_db_gpa); + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].offset = offsetof(struct create_cq_ext_data, db_gpa); + + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].size = sizeof(struct cq_db_gpa); + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].offset = offsetof(struct create_cq_ext_data, sn_gpa); + + /* + * Fill up the gpa range for cq buffer. 
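+ *
+ * Each gpa_range handed to the host is a (byte_count, byte_offset,
+ * pfn_array[]) triple: byte_offset locates the data within the
+ * first page and pfn_array names every page the data spans. The
+ * host stitches the guest buffer back together from this triple,
+ * so the 8-byte doorbell below needs nothing more than one PFN
+ * plus the intra-page offset of cq->db_addr.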
+ */ + + pkt->ext_data.db_gpa.byte_count = 8; + pkt->ext_data.db_gpa.byte_offset = offset_in_page(cq->db_addr); + user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], cq->db_umem); + + pkt->ext_data.sn_gpa.byte_count = 4; + pkt->ext_data.sn_gpa.byte_offset = offset_in_page(&cq->arm_sn); + init_pfn(&pkt->ext_data.sn_gpa.pfn_array[0], + &cq->arm_sn, + 4); + + pkt->ext_data.cqbuf_gpa.byte_count = (cq->entries * uctx->cqe_size); + pkt->ext_data.cqbuf_gpa.byte_offset = offset_in_page(cq->cq_buf); + user_va_init_pfn(&pkt->ext_data.cqbuf_gpa.pfn_array[0], cq->umem); + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, cq_pkt_size, (u64)pkt); + + if (ret) + goto cr_cq_err; + + /* + * Copy the necessary response from the host. + */ + cq->cqn = pkt->mappings.cq_resp.cqn; + cq->cqe = pkt->mappings.cq_resp.cqe; + cq->cq_handle = pkt->ioctl.resrc_desc.handle; + + ret = insert_handle(nd_dev, &nd_dev->cqidr, cq, cq->cqn); + + if (ret) + goto cr_cq_err; + hvnd_debug("CQ create after success cqn is %d\n", cq->cqn); + hvnd_debug("CQ create after success cqe is %d\n", cq->cqe); + hvnd_debug("CQ create after success cq handle is %p\n", + (void *)cq->cq_handle); + +cr_cq_err: + kfree(pkt); + return ret; +} + +int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq) +{ + struct pkt_nd_free_cq free_cq_pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&free_cq_pkt, 0, sizeof(free_cq_pkt)); + hvnd_init_hdr(&free_cq_pkt.hdr, + sizeof(struct pkt_nd_free_cq) - + sizeof(struct ndv_packet_hdr_control_1), + cq->uctx->create_pkt.handle.local, + cq->uctx->create_pkt.handle.remote, + IOCTL_ND_CQ_FREE, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + free_cq_pkt.ioctl.in.version = ND_VERSION_1; + free_cq_pkt.ioctl.in.handle = cq->cq_handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &free_cq_pkt.hdr, + sizeof(struct pkt_nd_free_cq), + (u64)&free_cq_pkt); + + if (ret) + goto free_cq_err; + + remove_handle(nd_dev, &nd_dev->cqidr, cq->cqn); + + return 0; + +free_cq_err: + return ret; +} + + +int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq, + u32 notify_type, u64 irp_handle) +{ + struct pkt_nd_notify_cq notify_cq_pkt; + int ret; + union ndv_context_handle irp_fhandle; + + irp_fhandle.local = cq->ep_object.local_irp; + + // KYS try to avoid having to zero everything + memset(¬ify_cq_pkt, 0, sizeof(notify_cq_pkt)); + hvnd_init_hdr(¬ify_cq_pkt.hdr, + sizeof(struct pkt_nd_notify_cq) - + sizeof(struct ndv_packet_hdr_control_1), + cq->uctx->create_pkt.handle.local, + cq->uctx->create_pkt.handle.remote, + IOCTL_ND_CQ_NOTIFY, 0, 0, irp_fhandle.val64); + + /* + * Now fill in the ioctl section. + */ + notify_cq_pkt.ioctl.in.version = ND_VERSION_1; + notify_cq_pkt.ioctl.in.cq_handle = cq->cq_handle; + notify_cq_pkt.ioctl.in.type = notify_type; + + + ret = hvnd_send_ioctl_pkt(nd_dev, ¬ify_cq_pkt.hdr, + sizeof(struct pkt_nd_notify_cq), + (u64)¬ify_cq_pkt); + + return ret; +} + +/* + * Memory region operations. + */ +int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 pd_handle, u64 *mr_handle) +{ + struct pkt_nd_create_mr pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_MR_CREATE, 0, 0, 0); + + /* + * Now fill in the ioctl section. 
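+ *
+ * Note the split in the MR interface: IOCTL_ND_MR_CREATE only
+ * allocates a host-side handle under the PD; the user pages are
+ * pinned and mapped later by hvnd_mr_register(), and teardown
+ * reverses both steps. A typical caller (sketch, error handling
+ * elided; mr is a struct hvnd_mr set up by the provider layer):
+ *
+ *	u64 mr_handle;
+ *
+ *	hvnd_cr_mr(nd_dev, uctx, pd_handle, &mr_handle);
+ *	mr->mr_handle = mr_handle;
+ *	hvnd_mr_register(nd_dev, uctx, mr);
+ *	...
+ *	hvnd_deregister_mr(nd_dev, uctx, mr_handle);
+ *	hvnd_free_mr(nd_dev, uctx, mr_handle);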
+ */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.handle = pd_handle; + hvnd_debug("PD handle is %p\n", (void *)pd_handle); + pkt.ioctl.in.reserved = 0; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + /* + * Copy the handle. + */ + hvnd_debug("mr handle is %p\n", (void *)pkt.ioctl.out); + *mr_handle = pkt.ioctl.out; + + return 0; + +err: + hvnd_error("create mr failed: %d\n", ret); + return ret; + +} + +int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 handle) +{ + return hvnd_free_handle(nd_dev, uctx, handle, IOCTL_ND_MR_FREE); +} + +int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 handle) +{ + struct pkt_nd_deregister_mr pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_MR_DEREGISTER, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.handle = handle; + pkt.ioctl.in.reserved = 0; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + hvnd_error("de-register mr failed: %d\n", ret); + return ret; + +} + +static inline u32 hvnd_convert_access(int acc) +{ + return (acc & IB_ACCESS_REMOTE_WRITE ? ND_MR_FLAG_ALLOW_REMOTE_WRITE : 0) | + (acc & IB_ACCESS_REMOTE_READ ? ND_MR_FLAG_ALLOW_REMOTE_READ : 0) | + (acc & IB_ACCESS_LOCAL_WRITE ? ND_MR_FLAG_ALLOW_LOCAL_WRITE : 0); +} + + +int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + struct hvnd_mr *mr) +{ + struct pkt_nd_register_mr pkt; + int ret; + struct hv_mpb_array *pb; + struct vmbus_packet_mpb_array *tpb; + int sz_leaf; + int num_pgs; + int i = 0; + int ext_data_sz; + u32 acc_flags; + u32 desc_size; + int pkt_type; + + /* + * The user address is passed in via a two level structure. + * An Array of struct hv_page_buffer will be used to describe + * the user memory. The pages containing this array will be descibed + * in another array of struct hv_page_buffer. We pass this seconed level + * array to the host. + */ + + hvnd_debug("ib_umem_page_count(mr->umem)=%d\n", + ib_umem_page_count(mr->umem)); + + sz_leaf = ib_umem_page_count(mr->umem) * sizeof(u64) + + sizeof(struct hv_mpb_array); + + pb = (struct hv_mpb_array *) __get_free_pages(GFP_KERNEL|__GFP_ZERO, + get_order(sz_leaf)); + + if (pb == NULL) + return -ENOMEM; + /* + * Allocate an array of hv_page_buffer to describe the first level. + */ + num_pgs = DIV_ROUND_UP(sz_leaf, PAGE_SIZE); + hvnd_debug("num pages in the top array is %d\n", num_pgs); + + desc_size = (num_pgs * sizeof(u64) + + sizeof(struct vmbus_packet_mpb_array)); + tpb = (struct vmbus_packet_mpb_array *) + __get_free_pages(GFP_KERNEL|__GFP_ZERO, get_order(desc_size)); + + if (tpb == NULL) { + free_pages((unsigned long)pb, get_order(sz_leaf)); + return -ENOMEM; + } + + hvnd_debug("sz leaf: %d; pgs in top %d\n", sz_leaf, num_pgs); + + /* + * Now fill the leaf level array. + */ + pb->len = mr->length; + pb->offset = offset_in_page(mr->start); + user_va_init_pfn(pb->pfn_array, mr->umem); + + /* + * Now fill out the top level array. 
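+ *
+ * Sizing recap for the two-level scheme: the leaf hv_mpb_array
+ * carries one u64 PFN per user page, and the top-level
+ * vmbus_packet_mpb_array carries one PFN per page of the leaf
+ * array itself. Registering 1 GiB, for instance, means 262144
+ * leaf PFNs (2 MiB of PFN data), which in turn needs 512
+ * top-level PFNs; only the top-level descriptor travels over the
+ * channel.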
+ */ + for (i = 0; i < num_pgs; i++) { + tpb->range.pfn_array[i] = virt_to_phys( + (u8 *)pb + (PAGE_SIZE * i)) >> PAGE_SHIFT; + hvnd_debug("virtual address = %p\n", + (u8 *)pb + (PAGE_SIZE * i)); + hvnd_debug("physical address = %llx\n", + virt_to_phys((u8 *)pb + (PAGE_SIZE * i))); + hvnd_debug("tpb->range.pfn_array[%d]=%llx\n", i, + tpb->range.pfn_array[i]); + } + + tpb->range.offset = 8; + tpb->range.len = ib_umem_page_count(mr->umem) * sizeof(u64); + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + ext_data_sz = (ib_umem_page_count(mr->umem) * sizeof(u64)); + acc_flags = ND_MR_FLAG_DO_NOT_SECURE_VM | hvnd_convert_access(mr->acc); + hvnd_debug("memory register access flags are: %x\n", acc_flags); + + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_MR_REGISTER, 0, 0, 0); + + /* + * The memory registration call uses a different mechanism to pass + * pfn information. + */ + + pkt_type = pkt.hdr.pkt_hdr.packet_type; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTION_EXTERNAL_DATA); + pkt.hdr.pkt_hdr.packet_type = pkt_type; + + pkt.hdr.extended_data.size = ext_data_sz; + pkt.hdr.extended_data.offset = 0; + /* + * Now fill out the ioctl. + */ + + pkt.ioctl.in.header.version = ND_VERSION_1; + pkt.ioctl.in.header.flags = acc_flags; + pkt.ioctl.in.header.cb_length = mr->length; + pkt.ioctl.in.header.target_addr = mr->virt; + pkt.ioctl.in.header.mr_handle = mr->mr_handle; + pkt.ioctl.in.address = mr->virt; + + /* + * Now send the packet to the host. + */ + + ret = hvnd_send_pgbuf_ioctl_pkt(nd_dev, + tpb, desc_size, + &pkt.hdr, + sizeof(pkt), + (unsigned long)&pkt); + + if (ret) + goto err; + + hvnd_info("MR REGISTRATION SUCCESS\n"); + /* + * Copy the mr registration data. + */ + hvnd_debug("mr registration lkey %x\n", pkt.ioctl.out.lkey); + hvnd_debug("mr registration rkey %x\n", pkt.ioctl.out.rkey); + + mr->mr_lkey = pkt.ioctl.out.lkey; + mr->mr_rkey = pkt.ioctl.out.rkey; + + mr->ibmr.lkey = mr->mr_lkey; + mr->ibmr.rkey = be32_to_cpu(mr->mr_rkey); + hvnd_debug("ibmr registration lkey %x\n", mr->ibmr.lkey); + hvnd_debug("ibmr registration rkey %x\n", mr->ibmr.rkey); + + free_pages((unsigned long)pb, get_order(sz_leaf)); + free_pages((unsigned long)tpb, get_order(desc_size)); + + return 0; + +err: + free_pages((unsigned long)pb, get_order(sz_leaf)); + free_pages((unsigned long)tpb, get_order(desc_size)); + + hvnd_error("mr register failed: %d\n", ret); + return ret; +} + +/* + * Listener operations. + */ +int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 *listener_handle) +{ + struct pkt_nd_cr_listener pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_LISTENER_CREATE, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; + hvnd_debug("Adaptor handle is %p\n", (void *)uctx->adaptor_hdl); + pkt.ioctl.in.hdr.reserved = 0; + pkt.ioctl.in.to_semantics = false; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + /* + * Copy the listener handle. 
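+ *
+ * The handle copied here drives the rest of the passive side,
+ * which parallels socket semantics: bind to a local address,
+ * listen with a backlog, then hand the host a connector to
+ * populate when a peer connects. In terms of the helpers below
+ * (sketch, error handling elided):
+ *
+ *	hvnd_cr_listener(nd_dev, uctx, &listener);
+ *	hvnd_bind_listener(nd_dev, uctx, listener, &addr);
+ *	hvnd_listen_listener(nd_dev, uctx, listener, backlog);
+ *	hvnd_get_connection_listener(nd_dev, uctx, listener,
+ *				     connector, irp_handle);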
+ */ + hvnd_debug("listener handle is %p\n", (void *)pkt.ioctl.out); + *listener_handle = pkt.ioctl.out; + + return 0; + +err: + hvnd_error("create listener failed: ret=%d uctx=%p adaptor handle=%llu\n", + ret, uctx, uctx->adaptor_hdl); + return ret; + +} + +int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 listener_handle) +{ + struct pkt_nd_free_listener pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_LISTENER_FREE, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.handle = listener_handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + hvnd_error("free listener failed: %d\n", ret); + return ret; +} + +int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 listener_handle, union nd_sockaddr_inet *addr) +{ + struct pkt_nd_bind_listener pkt; + kuid_t uid = current_uid(); + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_LISTENER_BIND, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.handle = listener_handle; + pkt.ioctl.in.hdr.reserved = 0; + + pkt.ioctl.in.authentication_id = (u32)uid.val; + pkt.ioctl.in.is_admin = false; + + memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + hvnd_error("bind listener failed: %d\n", ret); + return ret; +} + +int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 listener_handle, u32 backlog) +{ + struct pkt_nd_listen_listener pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_LISTENER_LISTEN, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.listener_handle = listener_handle; + pkt.ioctl.in.back_log = backlog; + + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + hvnd_error("listen listener failed: %d\n", ret); + return ret; +} + +int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 listener_handle, union nd_sockaddr_inet *addr) +{ + struct pkt_nd_get_addr_listener pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_LISTENER_GET_ADDRESS, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.handle = listener_handle; + + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + /* + * Copy the adddress. 
+ */ + + memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); + + return 0; + +err: + hvnd_error("get address listener failed: %d\n", ret); + return ret; +} + +int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 listener_handle, u64 connector_handle, + u64 irp_handle) +{ + struct pkt_nd_get_connection_listener pkt; + int ret; + union ndv_context_handle irp_fhandle; + + ret = get_irp_handle(nd_dev, &irp_fhandle.local, (void *)irp_handle); + + if (ret) { + hvnd_error("get_irp_handle() failed: err: %d\n", ret); + return ret; + } + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST, 0, 0, + irp_fhandle.val64); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.listener_handle = listener_handle; + pkt.ioctl.in.connector_handle = connector_handle; + + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + hvnd_error("get connection listener failed: %d\n", ret); + return ret; +} + +/* + * Connector APIs. + */ + +int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 *connector_handle) +{ + struct pkt_nd_cr_connector pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_CREATE, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; + pkt.ioctl.in.to_semantics = false; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + /* + * Copy the connector handle. + */ + hvnd_debug("connector handle is %p\n", (void *)pkt.ioctl.out); + *connector_handle = pkt.ioctl.out; + + return 0; + +err: + return ret; +} + +int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 handle) +{ + struct pkt_nd_free_connector pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_FREE, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.handle = handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + return ret; +} + +int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 handle, union nd_sockaddr_inet *addr) +{ + struct pkt_nd_bind_connector pkt; + int ret; + kuid_t uid = current_uid(); + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_BIND, 0, 0, 0); + + /* + * Now fill in the ioctl section. 
+ */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.handle = handle; + + memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); + + pkt.ioctl.in.authentication_id = (u32)uid.val; + pkt.ioctl.in.is_admin = false; + + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + return ret; +} + +int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit, + u32 priv_data_length, const u8 *priv_data, + u64 qp_handle, struct if_physical_addr *phys_addr, + union nd_sockaddr_inet *dest_addr, + struct hvnd_ep_obj *ep) +{ + struct pkt_nd_connector_connect *pkt = &ep->connector_connect_pkt; + int ret; + union ndv_context_handle irp_fhandle; + + hvnd_debug("local irp is %d\n", ep->local_irp); + irp_fhandle.local = ep->local_irp; + + if (priv_data_length > MAX_PRIVATE_DATA_LEN) { + hvnd_error("priv_data_length=%d\n", priv_data_length); + return -EINVAL; + } + + // KYS try to avoid having to zero everything + memset(pkt, 0, sizeof(*pkt)); + hvnd_init_hdr(&pkt->hdr, + sizeof(*pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_CONNECT, 0, 0, irp_fhandle.val64); + + /* + * Now fill in the ioctl section. + */ + pkt->ioctl.in.hdr.version = ND_VERSION_1; + pkt->ioctl.in.hdr.connector_handle = connector_handle; + pkt->ioctl.in.hdr.read_limits.inbound = in_rd_limit; + pkt->ioctl.in.hdr.read_limits.outbound = out_rd_limit; + pkt->ioctl.in.hdr.cb_private_data_length = priv_data_length; + pkt->ioctl.in.hdr.cb_private_data_offset = + offsetof(union connector_connect_ioctl, in.priv_data); + pkt->ioctl.in.hdr.qp_handle = qp_handle; + + memcpy(&pkt->ioctl.in.hdr.phys_addr, phys_addr, + sizeof(struct if_physical_addr)); + + /* + * Luke's code does not copy the ip address. + */ + memcpy(&pkt->ioctl.in.hdr.destination_address, dest_addr, + sizeof(union nd_sockaddr_inet)); + + pkt->ioctl.in.retry_cnt = 7; + pkt->ioctl.in.rnr_retry_cnt = 7; + memcpy(pkt->ioctl.in.priv_data, priv_data, priv_data_length); + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, sizeof(*pkt), (u64)pkt); + + if (ret) + goto err; + + return 0; + +err: + return ret; +} + +int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, + enum ibv_qp_state *qp_state) +{ + struct pkt_nd_connector_connect_complete pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_COMPLETE_CONNECT, 0, 0, 0); + + /* + * Now fill in the ioctl section. 
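+ *
+ * The active-side connect is split in two: the asynchronous
+ * IOCTL_ND_CONNECTOR_CONNECT above completes through the
+ * connector's IRP, and this synchronous COMPLETE_CONNECT then
+ * finishes the exchange and reports the resulting QP state, which
+ * the caller uses to move its queue pair into the matching state.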
+ */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.handle = connector_handle; + pkt.ioctl.in.rnr_nak_to = 0; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + *qp_state = pkt.ioctl.out.state; + return 0; + +err: + return ret; +} + +int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 connector_handle, + u64 qp_handle, + u32 in_rd_limit, u32 out_rd_limit, + u32 priv_data_length, const u8 *priv_data, + enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep) +{ + struct pkt_nd_connector_accept pkt; + int ret; + union ndv_context_handle irp_fhandle; + + irp_fhandle.local = ep->local_irp; + + if (priv_data_length > MAX_PRIVATE_DATA_LEN) { + hvnd_error("priv_data_length=%d\n", priv_data_length); + return -EINVAL; + } + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_ACCEPT, 0, 0, irp_fhandle.val64); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.reserved = 0; + pkt.ioctl.in.hdr.read_limits.inbound = in_rd_limit; + pkt.ioctl.in.hdr.read_limits.outbound = out_rd_limit; + pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; + + pkt.ioctl.in.hdr.cb_private_data_offset = + offsetof(struct connector_accept_in, private_data); + + pkt.ioctl.in.hdr.connector_handle = connector_handle; + pkt.ioctl.in.hdr.qp_handle = qp_handle; + + pkt.ioctl.in.rnr_nak_to = 0; + pkt.ioctl.in.rnr_retry_cnt = 7; + + + memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + *qp_state = pkt.ioctl.out.state; + return 0; + +err: + return ret; +} + +int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 connector_handle, + u32 priv_data_length, u8 *priv_data, + enum ibv_qp_state *qp_state) +{ + struct pkt_nd_connector_reject pkt; + int ret; + + if (priv_data_length > MAX_PRIVATE_DATA_LEN) { + hvnd_error("priv_data_length=%d\n", priv_data_length); + return -EINVAL; + } + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_REJECT, 0, 0, 0); + + /* + * Now fill in the ioctl section. 
+ */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.reserved = 0; + pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; + + pkt.ioctl.in.hdr.cb_private_data_offset = + offsetof(struct connector_reject_in, private_data); + + pkt.ioctl.in.hdr.connector_handle = connector_handle; + + memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + *qp_state = pkt.ioctl.out.state; + return 0; + +err: + return ret; +} + +int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, + struct nd_read_limits *rd_limits) +{ + struct pkt_nd_connector_get_rd_limits pkt; + int ret; + + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_GET_READ_LIMITS, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.in.version = ND_VERSION_1; + pkt.ioctl.in.in.reserved = 0; + pkt.ioctl.in.in.handle = connector_handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + *rd_limits = pkt.ioctl.out.out; + return 0; + +err: + return ret; +} + +int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, + u8 *priv_data) +{ + struct pkt_nd_connector_get_priv_data pkt; + int ret; + + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.reserved = 0; + pkt.ioctl.in.handle = connector_handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + memcpy(priv_data, pkt.ioctl.out, MAX_PRIVATE_DATA_LEN); + return 0; + +err: + return ret; +} + +int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, + union nd_sockaddr_inet *peer_addr) +{ + struct pkt_nd_connector_get_peer_addr pkt; + int ret; + + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS, 0, 0, 0); + + /* + * Now fill in the ioctl section. 
+ */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.reserved = 0; + pkt.ioctl.in.handle = connector_handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); + return 0; + +err: + return ret; +} + + +int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, struct hvnd_ep_obj *ep) +{ + struct pkt_nd_connector_notify_disconnect pkt; + int ret; + union ndv_context_handle irp_fhandle; + + irp_fhandle.local = ep->local_irp; + + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT, 0, 0, + irp_fhandle.val64); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.reserved = 0; + pkt.ioctl.in.handle = connector_handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + return ret; +} + + +// Asynchronous call +int hvnd_connector_disconnect(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, struct hvnd_ep_obj *ep) +{ + struct pkt_nd_connector_disconnect pkt; + int ret; + union ndv_context_handle irp_fhandle; + + irp_fhandle.local = ep->local_irp; + + + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_DISCONNECT, 0, 0, irp_fhandle.val64); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.reserved = 0; + pkt.ioctl.in.handle = connector_handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + return ret; +} + +/* + * QP operations. + */ +int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + struct hvnd_qp *qp) +{ + int ret; + struct pkt_nd_create_qp *pkt; + int num_pfn, num_db_pfn; + int qp_pkt_size; + unsigned int offset; + u32 ext_data_offset; + u32 ext_data_size; + + /* + * Now create QP. + * First compute the number of PFNs we need to accommodate: + * one each for the doorbell and the pages of the QP buffer. 
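+ *
+ * For example (illustrative numbers): a 16 KiB QP buffer that
+ * starts 1024 bytes into a page needs
+ * DIV_ROUND_UP(1024 + 16384, 4096) = 5 PFNs, while the 4-byte
+ * doorbell almost always fits in one page and hence one PFN.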
+ */ + offset = offset_in_page(qp->qp_buf); + num_pfn = DIV_ROUND_UP(offset + qp->buf_size, PAGE_SIZE); + + offset = offset_in_page(qp->db_addr); + num_db_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); + + qp_pkt_size = sizeof(struct pkt_nd_create_qp) + + (num_pfn * sizeof(u64)); + + hvnd_debug("CREATE QP, num pfns is %d\n", num_pfn); + hvnd_debug("CREATE QP, num DB pfns is %d\n", num_db_pfn); + + pkt = kzalloc(qp_pkt_size, GFP_KERNEL); + + if (!pkt) + return -ENOMEM; + + hvnd_debug("offset of nd_create_qp is %d\n", + (int)offsetof(struct pkt_nd_create_qp, ioctl.input)); + + ext_data_offset = offsetof(struct pkt_nd_create_qp, ext_data) - + sizeof(struct ndv_packet_hdr_control_1); + + ext_data_size = sizeof(struct create_qp_ext_data) + + (num_pfn * sizeof(u64)); + + hvnd_init_hdr(&pkt->hdr, + qp_pkt_size - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_QP_CREATE, + ext_data_size, + ext_data_offset, + 0); + + /* + * Now fill out the ioctl section. + */ + + pkt->ioctl.input.hdr.version = ND_VERSION_1; + + if (qp->max_inline_data > nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold) + qp->max_inline_data = nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold; + pkt->ioctl.input.hdr.cb_max_inline_data = qp->max_inline_data; + + hvnd_debug("pkt->ioctl.input.hdr.cb_max_inline_data=%d\n", + pkt->ioctl.input.hdr.cb_max_inline_data); + + pkt->ioctl.input.hdr.ce_mapping_cnt = MLX4_IB_CREATE_QP_MAPPINGS_MAX; + pkt->ioctl.input.hdr.cb_mapping_offset = sizeof(union create_qp_ioctl); + + pkt->ioctl.input.hdr.initiator_queue_depth = qp->initiator_q_depth; + pkt->ioctl.input.hdr.max_initiator_request_sge = + qp->initiator_request_sge; + + hvnd_debug("recv cq handle is %p\n", (void *)qp->receive_cq_handle); + hvnd_debug("send cq handle is %p\n", (void *)qp->initiator_cq_handle); + hvnd_debug("pd handle is %p\n", (void *)qp->pd_handle); + pkt->ioctl.input.hdr.receive_cq_handle = qp->receive_cq_handle; + pkt->ioctl.input.hdr.initiator_cq_handle = qp->initiator_cq_handle; + pkt->ioctl.input.hdr.pd_handle = qp->pd_handle; + + + hvnd_debug("ce_mapping cnt is %d\n", + pkt->ioctl.input.hdr.ce_mapping_cnt); + hvnd_debug("cb_mapping offset is %d\n", + pkt->ioctl.input.hdr.cb_mapping_offset); + + pkt->ioctl.input.receive_queue_depth = qp->receive_q_depth; + pkt->ioctl.input.max_receive_request_sge = qp->receive_request_sge; + + + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.map_type = ND_MAP_MEMORY; + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.access_type = ND_MODIFY_ACCESS; + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.address = (u64)qp->qp_buf; + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.cb_length = qp->buf_size; + + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.access_type = ND_WRITE_ACCESS; + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.address = (u64)qp->db_addr; + pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.cb_length = 4; + + pkt->mappings.qp_in.log_sq_bb_count = qp->log_sq_bb_count; + pkt->mappings.qp_in.log_sq_stride = qp->log_sq_stride; + pkt->mappings.qp_in.sq_no_prefetch = qp->sq_no_prefetch; + + + /* + * Fill in the extended data. 
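+ *
+ * (Editor's note: the extended-data section is a counted array of
+ * {size, offset} descriptors followed by the payloads they describe;
+ * each offset below is relative to the start of
+ * struct create_qp_ext_data itself, for example:
+ *
+ *     pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].offset =
+ *         offsetof(struct create_qp_ext_data, db_gpa);
+ *
+ * so the host can locate every payload without sharing the guest's
+ * compile-time struct layout.)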
+ */ + + pkt->ext_data.cnt = 2; + pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].size = + sizeof(struct gpa_range) + (num_pfn * sizeof(u64)); + pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].offset = + offsetof(struct create_qp_ext_data, qpbuf_gpa); + + pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].size = + sizeof(struct qp_db_gpa); + pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].offset = + offsetof(struct create_qp_ext_data, db_gpa); + + /* + * Fill up the gpa range for qp buffer. + */ + + pkt->ext_data.db_gpa.byte_count = 4; // KYS 8 or 16? + pkt->ext_data.db_gpa.byte_offset = offset_in_page(qp->db_addr); + user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], qp->db_umem); + + pkt->ext_data.qpbuf_gpa.byte_count = qp->buf_size; + pkt->ext_data.qpbuf_gpa.byte_offset = offset_in_page(qp->qp_buf); + user_va_init_pfn(&pkt->ext_data.qpbuf_gpa.pfn_array[0], qp->umem); + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, qp_pkt_size, (u64)pkt); + + if (ret) + goto cr_qp_err; + + /* + * Copy the necessary response from the host. + */ + qp->qp_handle = pkt->ioctl.resrc_desc.handle; + + qp->qpn = pkt->mappings.qp_resp.qpn; + qp->max_send_wr = pkt->mappings.qp_resp.max_send_wr; + qp->max_recv_wr = pkt->mappings.qp_resp.max_recv_wr; + qp->max_send_sge = pkt->mappings.qp_resp.max_send_sge; + qp->max_recv_sge = pkt->mappings.qp_resp.max_recv_sge; + + + hvnd_debug("qp->max_send_wr=%d max_recv_wr=%d max_send_sge=%d max_recv_sge=%d max_inline_data=%d\n", + qp->max_send_wr, qp->max_recv_wr, qp->max_send_sge, + qp->max_recv_sge, qp->max_inline_data); + + ret = insert_handle(nd_dev, &nd_dev->qpidr, qp, qp->qpn); + + if (ret) + goto cr_qp_err; + + hvnd_debug("QP create after success qpn:%d qp:%p handle:%llu\n", + qp->qpn, qp, qp->qp_handle); + +cr_qp_err: + kfree(pkt); + return ret; +} + +int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + struct hvnd_qp *qp) +{ + int ret; + + ret = hvnd_free_handle(nd_dev, uctx, qp->qp_handle, IOCTL_ND_QP_FREE); + if (ret == 0) + remove_handle(nd_dev, &nd_dev->qpidr, qp->qpn); + return ret; +} + +int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + struct hvnd_qp *qp) +{ + struct pkt_nd_flush_qp pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_QP_FLUSH, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.reserved = 0; + pkt.ioctl.in.handle = qp->qp_handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + return ret; +} + + +int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, + char *mac_addr) +{ + int ret; + int pkt_type = NDV_PKT_ID1_BIND; + + /* + * Send the bind information over to the host. + * For now, we will have a single ip and MAC address that we + * will deal with. Down the road we will need to expand support + * for multiple IP and MAC addresses and also deal with changing + * IP addresses. 
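+ *
+ * (Editor's note: the single-address assumption is visible below in the
+ * raw four-byte copy of the IPv4 address:
+ *
+ *     nd_dev->bind_pkt.ip_address.ipv4.sin_addr.s_addr =
+ *         *(unsigned int *)ip_addr;
+ *
+ * ip_addr is assumed to hold four bytes in network order; multi-address
+ * support would presumably send one such bind packet per IP/MAC tuple,
+ * each with its own unique guest_id.)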
+ */ + + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); + hvnd_debug("bind packet type is %d ID:%d\n", + pkt_type, NDV_PACKET_TYPE_ID(pkt_type)); + nd_dev->bind_pkt.pkt_hdr.packet_type = pkt_type; + + nd_dev->bind_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_bind_1); + hvnd_debug("bind packet size is %d\n", + (int)sizeof(struct ndv_pkt_hdr_bind_1)); + nd_dev->bind_pkt.pkt_hdr.data_sz = 0; + nd_dev->bind_pkt.unbind = un_bind; + nd_dev->bind_pkt.ip_address.address_family = AF_INET; + nd_dev->bind_pkt.ip_address.ipv4.sin_family = AF_INET; + nd_dev->bind_pkt.ip_address.ipv4.sin_port = 0; + nd_dev->bind_pkt.ip_address.ipv4.sin_addr.s_addr = *(unsigned int *)ip_addr; + + nd_dev->bind_pkt.phys_addr.length = ETH_ALEN; + memcpy(nd_dev->bind_pkt.phys_addr.addr, mac_addr, ETH_ALEN); + + /* + * This is the adapter handle; needs to be unique for each + * MAC, ip address tuple. + */ + nd_dev->bind_pkt.guest_id = (u64)nd_dev; + + ret = hvnd_send_packet(nd_dev, &nd_dev->bind_pkt, + sizeof(struct ndv_pkt_hdr_bind_1), + (u64)NULL, + true); + return ret; +} + +int hvnd_init_resources(struct hvnd_dev *nd_dev) +{ + unsigned long mmio_sz; + struct resource *resrc; + int ret = -ENOMEM; + + resrc = &iomem_resource; + + mmio_sz = nd_dev->hvdev->channel->offermsg.offer.mmio_megabytes * + 1024 * 1024; + nd_dev->mmio_sz = mmio_sz; + nd_dev->mmio_resource.name = KBUILD_MODNAME; + nd_dev->mmio_resource.flags = IORESOURCE_MEM | IORESOURCE_BUSY; + + ret = allocate_resource(resrc, &nd_dev->mmio_resource, + mmio_sz, 0, -1, mmio_sz, NULL, NULL); + + if (ret) { + hvnd_error("Unable to allocate mmio resources\n"); + return ret; + } + hvnd_debug("MMIO start is %p\n", (void *)nd_dev->mmio_resource.start); + + /* + * Send the mmio information over to the host. + */ + nd_dev->resources.pkt_hdr.packet_type = NDV_PKT_ID1_INIT_RESOURCES; + nd_dev->resources.pkt_hdr.hdr_sz = sizeof(union ndv_packet_hdr); + nd_dev->resources.pkt_hdr.data_sz = 0; + + nd_dev->resources.io_space_sz_mb = mmio_sz; + nd_dev->resources.io_space_start = nd_dev->mmio_resource.start; + + ret = hvnd_send_packet(nd_dev, &nd_dev->resources, + sizeof(struct ndv_pkt_hdr_init_resources_1), + (u64)NULL, + true); + return ret; +} + +int hvnd_query_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) +{ + struct pkt_nd_query_adaptor *pkt; + int ret; + int pkt_type; + struct ib_device_attr *props = &nd_dev->ibdev.attrs; + struct adapter_info_v2 *adap_info = &nd_dev->query_pkt.ioctl.ad_info; + + hvnd_debug("Performing Adapter query nd_dev=%p\n", nd_dev); + + // check if there is a need to do query + if (nd_dev->query_pkt_set) + return 0; + + // need a lock, multiple process can call this at the same time + + down(&nd_dev->query_pkt_sem); + if (nd_dev->query_pkt_set) { + up(&nd_dev->query_pkt_sem); + return 0; + } + + /* + * Now query the adaptor. 
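+ *
+ * (Editor's note: the query_pkt_set test above, repeated after
+ * down(&nd_dev->query_pkt_sem), is the classic double-checked pattern:
+ *
+ *     if (nd_dev->query_pkt_set)       // cheap unlocked test
+ *         return 0;
+ *     down(&nd_dev->query_pkt_sem);
+ *     if (nd_dev->query_pkt_set) {     // re-test under the semaphore
+ *         up(&nd_dev->query_pkt_sem);
+ *         return 0;
+ *     }
+ *
+ * so concurrent callers pay for the host round trip at most once.)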
+ */ + + pkt = &nd_dev->query_pkt; + + pkt_type = NDV_PKT_ID1_CONTROL; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); + + pkt->hdr.pkt_hdr.packet_type = pkt_type; + pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); + pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_query_adaptor) - + sizeof(struct ndv_packet_hdr_control_1); + + + pkt->hdr.file_handle.local = uctx->file_handle.local; + pkt->hdr.file_handle.remote = uctx->file_handle.remote; + + pkt->hdr.irp_handle.val64 = 0; + + pkt->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_QUERY; + pkt->hdr.output_buf_sz = sizeof(struct nd_adap_query_ioctl); + pkt->hdr.input_buf_sz = sizeof(struct nd_adap_query_ioctl); + pkt->hdr.input_output_buf_offset = 0; + memset(&pkt->ioctl.ad_q, 0, sizeof(struct nd_adap_query_ioctl)); + + pkt->ioctl.ad_q.version = ND_VERSION_1; + pkt->ioctl.ad_q.info_version = ND_VERSION_2; + pkt->ioctl.ad_q.adapter_handle = uctx->adaptor_hdl; + + ret = hvnd_send_packet(nd_dev, pkt, + sizeof(struct pkt_nd_query_adaptor), + (unsigned long)pkt, true); + + hvnd_debug("pkt->ioctl.ad_info.inline_request_threshold=%d\n", + pkt->ioctl.ad_info.inline_request_threshold); + nd_dev->query_pkt_set = true; + + // how about host returning PENDING + up(&nd_dev->query_pkt_sem); + + if (ret) + return ret; + + hvnd_debug("Query Adaptor Succeeded\n"); + + /* + * Cache the relevant properties out. + */ + props->fw_ver = 0; + props->device_cap_flags = 0; + props->vendor_id = 0x15b3; + props->vendor_part_id = adap_info->device_id; + props->max_mr_size = ~0ull; + props->page_size_cap = PAGE_SIZE; + props->max_qp = 16384; + props->max_qp_wr = min(adap_info->max_recv_q_depth, + adap_info->max_initiator_q_depth); + props->max_sge = min(adap_info->max_initiator_sge, + adap_info->max_recv_sge); + props->max_cq = 0x1FFFF; + props->max_cqe = adap_info->max_completion_q_depth; + props->max_mr = 16384; + props->max_pd = 16384; + props->max_qp_rd_atom = adap_info->max_inbound_read_limit; + props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; + props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; + props->max_srq = 16384; + props->max_srq_wr = adap_info->max_recv_q_depth; + props->max_srq_sge = adap_info->max_recv_sge; + return 0; +} + + +int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev, + struct hvnd_ib_pd *hvnd_pd) +{ + struct pkt_nd_pd_create *pkt = &uctx->pd_cr_pkt; + int ret; + int pkt_type; + + hvnd_debug("Create Protection Domain\n"); + + pkt_type = NDV_PKT_ID1_CONTROL; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); + + pkt->hdr.pkt_hdr.packet_type = pkt_type; + pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); + pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_pd_create) - + sizeof(struct ndv_packet_hdr_control_1); + + hvnd_debug("pdcreate packet size: %d\n", + (int)sizeof(struct pkt_nd_pd_create)); + hvnd_debug("pdcreate hdr size: %d\n", + (int)sizeof(struct ndv_packet_hdr_control_1)); + hvnd_debug("pdcreate data size: %d\n", pkt->hdr.pkt_hdr.data_sz); + + pkt->hdr.file_handle.local = uctx->create_pkt.handle.local; + pkt->hdr.file_handle.remote = uctx->create_pkt.handle.remote; + + hvnd_debug("create pd uctx is %p\n", uctx); + hvnd_debug("create pd local file is %d\n", + uctx->create_pkt.handle.local); + hvnd_debug("create pd local file is %d\n", + uctx->create_pkt.handle.remote); + + pkt->hdr.irp_handle.val64 = 0; + pkt->hdr.io_cntrl_code = IOCTL_ND_PD_CREATE; + + pkt->hdr.output_buf_sz = sizeof(struct nd_create_pd_ioctl); + 
pkt->hdr.input_buf_sz = sizeof(struct nd_create_pd_ioctl); + pkt->hdr.input_output_buf_offset = 0; + + hvnd_debug("output/input buf size: %d\n", pkt->hdr.output_buf_sz); + + /* + * Fill the ioctl section. + */ + pkt->ioctl.in.version = ND_VERSION_1; + pkt->ioctl.in.reserved = 0; + pkt->ioctl.in.handle = uctx->adaptor_hdl; + + + ret = hvnd_send_packet(nd_dev, pkt, + sizeof(struct pkt_nd_pd_create), + (unsigned long)pkt, true); + + if (ret) + return ret; + + if (pkt->hdr.pkt_hdr.status != 0) { + hvnd_error("Create PD failed; status is %d\n", + pkt->hdr.pkt_hdr.status); + return -EINVAL; + } + if (pkt->hdr.io_status != 0) { + hvnd_error("Create PD failed;io status is %d\n", + pkt->hdr.io_status); + return -EINVAL; + } + + hvnd_debug("Create PD Succeeded\n"); + + hvnd_debug("pd_handle is %p\n", (void *)pkt->ioctl.resp.pd_handle); + hvnd_debug("pdn is %d\n", (int)pkt->ioctl.resp.pdn); + + hvnd_pd->pdn = pkt->ioctl.resp.pdn; + hvnd_pd->handle = pkt->ioctl.out_handle; + + return 0; +} + +int hvnd_cancel_io(struct hvnd_ep_obj *ep_object) +{ + struct pkt_nd_cancel_io pkt; + int ret; + u32 ioctl; + + switch (ep_object->type) { + case ND_LISTENER: + hvnd_debug("LISTENER I/O Cancelled\n"); + ioctl = IOCTL_ND_LISTENER_CANCEL_IO; + break; + case ND_CONNECTOR: + hvnd_debug("CONNECTOR I/O Cancelled\n"); + ioctl = IOCTL_ND_CONNECTOR_CANCEL_IO; + break; + case ND_MR: + hvnd_debug("MR I/O Cancelled\n"); + ioctl = IOCTL_ND_MR_CANCEL_IO; + break; + case ND_CQ: + hvnd_debug("CQ I/O Cancelled\n"); + ioctl = IOCTL_ND_CQ_CANCEL_IO; + break; + default: + hvnd_error("UNKNOWN object type\n"); + return -EINVAL; + } + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + ep_object->uctx->create_pkt.handle.local, + ep_object->uctx->create_pkt.handle.remote, + ioctl, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.reserved = 0; + pkt.ioctl.in.handle = ep_object->ep_handle; + hvnd_debug("cancel io handle is %p\n", (void *)ep_object->ep_handle); + + ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, &pkt.hdr, + sizeof(pkt), + (u64)&pkt); + + if (ret) + goto err; + + /* + * Now that we have cancelled all I/Os, + */ + + return 0; + +err: + hvnd_error("cancel I/O operation failed\n"); + return ret; +} + + +int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 handle, u32 ioctl) +{ + struct pkt_nd_free_handle pkt; + int ret; + + hvnd_debug("Freeing handle ioctl is %s; handle is %p\n", + hvnd_get_op_name(ioctl), (void *)handle); + + hvnd_debug("uctx is %p\n", uctx); + hvnd_debug("nd_dev is %p\n", nd_dev); + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + ioctl, 0, 0, 0); + + /* + * Now fill in the ioctl section. 
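+ *
+ * (Editor's note: hvnd_free_handle() is the shared destructor for any
+ * host object that needs only its handle to be torn down; the caller
+ * selects the object type through the ioctl argument, as in
+ * hvnd_free_qp() earlier in this file:
+ *
+ *     ret = hvnd_free_handle(nd_dev, uctx, qp->qp_handle,
+ *                            IOCTL_ND_QP_FREE);
+ *
+ * The same helper serves IOCTL_ND_PD_FREE, IOCTL_ND_MR_FREE and the
+ * other *_FREE codes whose payload is a bare struct nd_handle.)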
+ */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.reserved = 0; + pkt.ioctl.in.handle = handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + hvnd_error("%s: ret=%d\n", __func__, ret); + return ret; +} + +int hvnd_negotiate_version(struct hvnd_dev *nd_dev) +{ + union ndv_packet_init *pkt = &nd_dev->init_pkt; + int ret; + + nd_dev->negotiated_version = NDV_PROTOCOL_VERSION_INVALID; + + pkt->packet_type = NDV_PACKET_TYPE_INIT; + pkt->protocol_version = NDV_PROTOCOL_VERSION_CURRENT; + pkt->flags = 0; // KYS are the flags 0? + + ret = hvnd_send_packet(nd_dev, pkt, + sizeof(union ndv_packet_init), (u64)NULL, true); + + return ret; +} + +void hvnd_callback(void *context) +{ + struct hv_device *dev = context; + struct hvnd_dev *nd_dev = hv_get_drvdata(dev); + int copy_sz = 0; + struct ndv_packet_hdr_control_1 *ctrl_hdr; + union ndv_packet_init *pkt_init; + u32 recvlen; + u32 local_irp; + u64 requestid; + u32 *pkt_type; + u32 pkt_id; + struct hvnd_ep_obj *ep_object; + struct incoming_pkt *incoming_pkt; /* Used only for asynch calls */ + char *incoming_pkt_start; + struct vmpacket_descriptor *desc; + int status; + struct hvnd_cookie *hvnd_cookie; + unsigned long flags; + + vmbus_recvpacket_raw(dev->channel, hvnd_recv_buffer, + (PAGE_SIZE * 4), &recvlen, &requestid); + + if (recvlen <= 0) + return; + + desc = (struct vmpacket_descriptor *)hvnd_recv_buffer; + incoming_pkt_start = hvnd_recv_buffer + (desc->offset8 << 3); + recvlen -= desc->offset8 << 3; + + pkt_type = (u32 *)incoming_pkt_start; + pkt_id = *pkt_type; + if (pkt_id != NDV_PACKET_TYPE_INIT) + pkt_id = NDV_PACKET_TYPE_ID(pkt_id); + + switch (pkt_id) { + case NDV_PACKET_TYPE_INIT: + /* + * Host is responding to our init packet. + */ + pkt_init = (union ndv_packet_init *)incoming_pkt_start; + nd_dev->negotiated_version = pkt_init->protocol_version; + copy_sz = 0; + break; + + case NDV_PKT_ID1_INIT_RESOURCES: + copy_sz = 0; + break; + + case NDV_PKT_ID1_BIND: + nd_dev->bind_pkt.pkt_hdr.status = + ((union ndv_packet_hdr *) incoming_pkt_start)->status; + copy_sz = 0; + break; + + case NDV_PKT_ID1_COMPLETE: + ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; + status = ctrl_hdr->io_status; + + local_irp = ctrl_hdr->irp_handle.local; + ep_object = (struct hvnd_ep_obj *)map_irp_to_ctx(nd_dev, + local_irp); + + if (!ep_object) { + hvnd_error("irp could not be mapped; irp is %d ioctl is %s", + local_irp, + hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); + goto complete; + } + + if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) + hvnd_debug("completion packet; iostatus is %x, ioctl is %s", + ctrl_hdr->io_status, + hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); + + switch (ctrl_hdr->io_cntrl_code) { + + case IOCTL_ND_CQ_NOTIFY: + hvnd_process_cq_event_complete(ep_object, status); + + ep_del_work_pending(ep_object); + goto complete; + + case IOCTL_ND_CONNECTOR_ACCEPT: + + hvnd_process_connector_accept(ep_object, status); + + ep_del_work_pending(ep_object); + goto complete; + + case IOCTL_ND_CONNECTOR_DISCONNECT: + hvnd_debug("disconnected: ep obj is %p; status: %d\n", + ep_object, status); + hvnd_process_disconnect(ep_object, status); + + ep_del_work_pending(ep_object); + goto complete; + + default: + break; + } + + /* + * This is the completion notification; + * the IRP cookie is the state through which + * we will invoke the callback.
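+ *
+ * (Editor's note: this path runs in the VMBus channel callback, which is
+ * why the allocation below uses GFP_ATOMIC and why the real work is
+ * deferred: the packet is queued on ep_object->incoming_pkt_list under a
+ * spinlock and ep_object->wrk.work is scheduled to consume it. A rough
+ * sketch of the consumer side, assuming a worker that drains the list:
+ *
+ *     spin_lock_irqsave(&ep->incoming_pkt_list_lock, flags);
+ *     // pop one incoming_pkt, drop the lock, run the CM callback,
+ *     // kfree() the packet, reacquire, repeat until the list is empty
+ *     spin_unlock_irqrestore(&ep->incoming_pkt_list_lock, flags);
+ *
+ * The worker itself is not part of this hunk.)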
+ */ + incoming_pkt = kmalloc(recvlen + sizeof(struct incoming_pkt), + GFP_ATOMIC); + if (incoming_pkt == NULL) { + hvnd_error("Could not alloc memory in callback\n"); + ep_del_work_pending(ep_object); + goto complete; + } + memcpy(incoming_pkt->pkt, incoming_pkt_start, recvlen); + + spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); + list_add_tail(&incoming_pkt->list_entry, + &ep_object->incoming_pkt_list); + spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, + flags); + + schedule_work(&ep_object->wrk.work); + + goto complete; + + case NDV_PKT_ID1_CREATE: + copy_sz = sizeof(struct ndv_pkt_hdr_create_1); + break; + + case NDV_PKT_ID1_CLEANUP: + copy_sz = sizeof(struct ndv_pkt_hdr_cleanup_1); + break; + + case NDV_PKT_ID1_CONTROL: + ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; + status = ctrl_hdr->io_status; + + if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) + hvnd_debug("packet; iostatus is %x ioctl is %s", + ctrl_hdr->io_status, hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); + + switch (ctrl_hdr->io_cntrl_code) { + + case IOCTL_ND_PROVIDER_INIT: + copy_sz = sizeof(struct pkt_nd_provider_ioctl); + break; + + case IOCTL_ND_PROVIDER_BIND_FILE: + copy_sz = sizeof(struct pkt_nd_provider_ioctl); + break; + + case IOCTL_ND_ADAPTER_OPEN: + copy_sz = sizeof(struct pkt_nd_open_adapter); + break; + + case IOCTL_ND_ADAPTER_CLOSE: + copy_sz = sizeof(struct pkt_nd_free_handle); + break; + + case IOCTL_ND_ADAPTER_QUERY: + copy_sz = sizeof(struct pkt_nd_query_adaptor); + break; + + case IOCTL_ND_PD_CREATE: + copy_sz = sizeof(struct pkt_nd_pd_create); + break; + + case IOCTL_ND_PD_FREE: + copy_sz = sizeof(struct pkt_nd_free_handle); + break; + + case IOCTL_ND_CQ_CREATE: + copy_sz = sizeof(struct pkt_nd_create_cq); + break; + + case IOCTL_ND_CQ_FREE: + copy_sz = sizeof(struct pkt_nd_free_cq); + break; + + case IOCTL_ND_CQ_NOTIFY: //FIXME check ep stop state + local_irp = ctrl_hdr->irp_handle.local; + ep_object = (struct hvnd_ep_obj *) + map_irp_to_ctx(nd_dev, local_irp); + if (!ep_object) { + hvnd_error("irp could not be mapped\n"); + goto complete; + return; + } + copy_sz = sizeof(struct pkt_nd_notify_cq); + hvnd_process_cq_event_pending(ep_object, status); + goto complete; + return; + + case IOCTL_ND_LISTENER_CREATE: + copy_sz = sizeof(struct pkt_nd_cr_listener); + break; + + case IOCTL_ND_LISTENER_FREE: + copy_sz = sizeof(struct pkt_nd_free_listener); + break; + + case IOCTL_ND_QP_FREE: + copy_sz = sizeof(struct pkt_nd_free_handle); + break; + + case IOCTL_ND_CONNECTOR_CANCEL_IO: + case IOCTL_ND_MR_CANCEL_IO: + case IOCTL_ND_CQ_CANCEL_IO: + case IOCTL_ND_LISTENER_CANCEL_IO: + copy_sz = sizeof(struct pkt_nd_cancel_io); + break; + + case IOCTL_ND_LISTENER_BIND: + copy_sz = sizeof(struct pkt_nd_bind_listener); + break; + + case IOCTL_ND_LISTENER_LISTEN: + copy_sz = sizeof(struct pkt_nd_listen_listener); + break; + + case IOCTL_ND_LISTENER_GET_ADDRESS: + copy_sz = sizeof(struct pkt_nd_get_addr_listener); + break; + + case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: + copy_sz = sizeof(struct pkt_nd_get_connection_listener); + goto complete; // non-block + + case IOCTL_ND_CONNECTOR_CREATE: + copy_sz = sizeof(struct pkt_nd_cr_connector); + break; + + case IOCTL_ND_CONNECTOR_FREE: + copy_sz = sizeof(struct pkt_nd_free_connector); + break; + + case IOCTL_ND_CONNECTOR_BIND: + copy_sz = sizeof(struct pkt_nd_free_connector); + break; + + // KYS: ALERT: ASYNCH Operation + case IOCTL_ND_CONNECTOR_CONNECT: + copy_sz = sizeof(struct pkt_nd_connector_connect); + goto complete; 
//non-block + + case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: + copy_sz = sizeof(struct pkt_nd_connector_connect_complete); + break; + + // KYS: ALERT: ASYNCH Operation + case IOCTL_ND_CONNECTOR_ACCEPT: + copy_sz = sizeof(struct pkt_nd_connector_accept); + goto complete; //non-block + + case IOCTL_ND_CONNECTOR_REJECT: + copy_sz = sizeof(struct pkt_nd_connector_reject); + break; + + case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: + copy_sz = sizeof(struct pkt_nd_connector_get_rd_limits); + break; + + case IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: + copy_sz = sizeof(struct pkt_nd_connector_get_priv_data); + break; + + case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: + copy_sz = sizeof(struct pkt_nd_connector_get_peer_addr); + break; + + case IOCTL_ND_CONNECTOR_GET_ADDRESS: + copy_sz = sizeof(struct pkt_nd_connector_get_addr); + break; + + // KYS: ALERT: ASYNCH Operation + case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: + copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); + goto complete; //non-block + + // KYS: ALERT: ASYNCH Operation + case IOCTL_ND_CONNECTOR_DISCONNECT: + hvnd_debug("IOCTL_ND_CONNECTOR_DISCONNECT\n"); + copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); + goto complete; // non-block + + case IOCTL_ND_QP_CREATE: + copy_sz = sizeof(struct pkt_nd_create_qp); + break; + + case IOCTL_ND_MR_CREATE: + copy_sz = sizeof(struct pkt_nd_create_mr); + break; + + case IOCTL_ND_MR_FREE: + copy_sz = sizeof(struct pkt_nd_free_handle); + break; + + case IOCTL_ND_MR_REGISTER: + copy_sz = sizeof(struct pkt_nd_register_mr); + break; + + case IOCTL_ND_MR_DEREGISTER: + copy_sz = sizeof(struct pkt_nd_deregister_mr); + break; + + case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: + copy_sz = sizeof(struct pkt_query_addr_list); + break; + + case IOCTL_ND_QP_FLUSH: + copy_sz = sizeof(struct pkt_nd_flush_qp); + break; + + default: + hvnd_warn("Got unknown ioctl: %d\n", + ctrl_hdr->io_cntrl_code); + copy_sz = 0; + break; + } + + break; + default: + hvnd_warn("Got an unknown packet type %d\n", *pkt_type); + break; + } + + hvnd_cookie = (struct hvnd_cookie *)requestid; + memcpy(hvnd_cookie->pkt, incoming_pkt_start, copy_sz); + complete(&hvnd_cookie->host_event); + +complete: + /* send out ioctl completion patcket */ + if (desc->flags & VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED) { + int retry = 5; + + while (true) { + int ret; + + ret = vmbus_sendpacket(dev->channel, NULL, 0, + requestid, VM_PKT_COMP, 0); + if (ret == 0) { + break; + } else if (ret == -EAGAIN) { + if (--retry == 0) { + hvnd_error("give up retrying send completion packet\n"); + break; + } + hvnd_warn("retrying send completion packet\n"); + udelay(100); + } else { + hvnd_error("unable to send completion packet ret=%d\n", + ret); + break; + } + } + } + +} diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.h new file mode 100644 index 000000000000..78e534fa8748 --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142.0/vmbus_rdma.h @@ -0,0 +1,2202 @@ +/* + * Copyright (c) 2014, Microsoft Corporation. + * + * Author: + * K. Y. Srinivasan <kys@microsoft.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. 
+ * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. + * + * Bug fixes/enhancements: Long Li <longli@microsoft.com> + */ + + +#ifndef _VMBUS_RDMA_H +#define _VMBUS_RDMA_H + + +#include <linux/in.h> +#include <linux/in6.h> +#include <rdma/ib_verbs.h> +#include <linux/idr.h> +#include <linux/if_ether.h> + +/* NetworkDirect version Numbers. + */ +#define ND_VERSION_1 0x1 +#define ND_VERSION_2 0x20000 + +#ifndef NDVER +#define NDVER ND_VERSION_2 +#endif + +#define ND_ADAPTER_FLAG_IN_ORDER_DMA_SUPPORTED 0x00000001 +#define ND_ADAPTER_FLAG_CQ_INTERRUPT_MODERATION_SUPPORTED 0x00000004 +#define ND_ADAPTER_FLAG_MULTI_ENGINE_SUPPORTED 0x00000008 +#define ND_ADAPTER_FLAG_CQ_RESIZE_SUPPORTED 0x00000100 +#define ND_ADAPTER_FLAG_LOOPBACK_CONNECTIONS_SUPPORTED 0x00010000 + +#define ND_CQ_NOTIFY_ERRORS 0 +#define ND_CQ_NOTIFY_ANY 1 +#define ND_CQ_NOTIFY_SOLICITED 2 + +#define ND_MR_FLAG_ALLOW_LOCAL_WRITE 0x00000001 +#define ND_MR_FLAG_ALLOW_REMOTE_READ 0x00000002 +#define ND_MR_FLAG_ALLOW_REMOTE_WRITE 0x00000005 +#define ND_MR_FLAG_RDMA_READ_SINK 0x00000008 +#define ND_MR_FLAG_DO_NOT_SECURE_VM 0x80000000 + +#define ND_OP_FLAG_SILENT_SUCCESS 0x00000001 +#define ND_OP_FLAG_READ_FENCE 0x00000002 +#define ND_OP_FLAG_SEND_AND_SOLICIT_EVENT 0x00000004 +#define ND_OP_FLAG_ALLOW_READ 0x00000008 +#define ND_OP_FLAG_ALLOW_WRITE 0x00000010 + +#if NDVER >= ND_VERSION_2 +#define ND_OP_FLAG_INLINE 0x00000020 +#endif + +#define ND_AF_INET6 23 +#define IF_MAX_ADDR_LENGTH 32 + +struct group_affinity { + u64 mask; //KYS: usually 0 + u16 group; // KYS usually -1 + u16 reserved[3]; +}; + +struct if_physical_addr { + u16 length; + u8 addr[IF_MAX_ADDR_LENGTH]; +}; + +struct adapter_info_v2 { + u32 info_version; + u16 vendor_id; + u16 device_id; + u64 adapter_id; + size_t max_registration_size; + size_t max_window_size; + u32 max_initiator_sge; + u32 max_recv_sge; + u32 max_read_sge; + u32 max_transfer_length; + u32 max_inline_data_size; + u32 max_inbound_read_limit; + u32 max_outbound_read_limit; + u32 max_recv_q_depth; + u32 max_initiator_q_depth; + u32 max_shared_recv_q_depth; + u32 max_completion_q_depth; + u32 inline_request_threshold; + u32 large_request_threshold; + u32 max_caller_data; + u32 max_callee_data; + u32 adapter_flags; +} __packed; + +struct nd2_adapter_info_32 { //KYS: Check what this is + u32 info_version; + u16 vendor_id; + u16 devic_id; + u64 adapter_id; + u32 max_registration_size; + u32 max_window_size; + u32 max_initiator_sge; + u32 max_recv_sge; + u32 max_read_sge; + u32 max_transfer_length; + u32 max_inline_data_size; + u32 max_inbound_read_limit; + u32 max_outbound_read_limit; + u32 max_recv_q_depth; + u32 max_initiator_q_depth; + u32 max_shared_recv_q_depth; + u32 max_completion_q_depth; + u32 inline_request_threshold; + u32 large_request_threshold; + u32 max_caller_data; + u32 max_callee_data; + u32 adapter_flags; +} __packed; + +enum nd2_request_type { + ND2_RT_RECEIVE, + ND2_RT_SEND, + ND2_RT_BIND, + ND2_RT_INVALIDATE, + ND2_RT_READ, + ND2_RT_WRITE +}; + +struct nd2_result { + u32 status; + u32 bytes_transferred; + void *qp_ctx; + void *request_ctx; + enum nd2_request_type request_type; +} __packed; + +struct nd2_sge { + void *buffer; + u32 buffer_length; + u32 mr_token; +} __packed; + +/* + * The communication with the host via ioctls using VMBUS + * as the 
transport. + */ + +#define ND_IOCTL_VERSION 1 + +enum nd_mapping_type { + ND_MAP_IOSPACE, + ND_MAP_MEMORY, + ND_MAP_MEMORY_COALLESCE, + ND_MAP_PAGES, + ND_MAP_PAGES_COALLESCE, + ND_UNMAP_IOSPACE, + ND_UNMAP_MEMORY, + ND_MAX_MAP_TYPE +}; + +enum nd_caching_type { + ND_NON_CACHED = 0, + ND_CACHED, + ND_WRITE_COMBINED, + ND_MAX_CACHE_TYPE +}; + +enum nd_aceess_type { + ND_READ_ACCESS = 0, + ND_WRITE_ACCESS, + ND_MODIFY_ACCESS +}; + +struct nd_map_io_space { + enum nd_mapping_type map_type; + enum nd_caching_type cache_type; + u32 cb_length; +}; + +struct nd_map_memory { + enum nd_mapping_type map_type; + enum nd_aceess_type access_type; + u64 address; + u32 cb_length; +}; + +struct nd_mapping_id { + enum nd_mapping_type map_type; + u64 id; +}; + +struct ndk_map_pages { + struct nd_map_memory header; + u32 page_offset; +}; + +union nd_mapping { + enum nd_mapping_type map_type; + struct nd_map_io_space map_io_space; + struct nd_map_memory map_memory; + struct nd_mapping_id mapping_id; + struct ndk_map_pages map_pages; +}; + +struct nd_mapping_result { + u64 id; + u64 info; +}; + +struct nd_resource_descriptor { + u64 handle; + u32 ce_mapping_results; + u32 cb_mapping_results_offset; +}; + +struct nd_handle { + u32 version; + u32 reserved; + u64 handle; +}; + +union nd_sockaddr_inet { + struct sockaddr_in ipv4; + struct sockaddr_in6 ipv6; + u16 address_family; //KYS how is this supposed to work? +}; + +struct nd_address_element { + union nd_sockaddr_inet addr; + char mac_addr[ETH_ALEN]; +}; + +struct nd_resolve_address { + u32 version; + u32 reserved; + union nd_sockaddr_inet address; +}; + +struct nd_open_adapter { + u32 version; + u32 reserved; + u32 ce_mapping_cnt; + u32 cb_mapping_offset; + u64 adapter_id; +}; + +struct nd_adapter_query { + u32 version; + u32 info_version; + u64 adapter_handle; +}; + +struct nd_create_cq { + u32 version; + u32 queue_depth; + u32 ce_mapping_cnt; + u32 cb_mapping_offset; + u64 adapter_handle; + struct group_affinity affinity; +}; + +struct nd_create_srq { + u32 version; + u32 queue_depth; + u32 ce_mapping_cnt; + u32 cb_mapping_offset; + u32 max_request_sge; + u32 notify_threshold; + u64 pd_handle; + struct group_affinity affinity; +}; + +struct nd_create_qp_hdr { + u32 version; + u32 cb_max_inline_data; + u32 ce_mapping_cnt; + u32 cb_mapping_offset; //KYS: what is this prefix - ce/cb + u32 initiator_queue_depth; + u32 max_initiator_request_sge; + u64 receive_cq_handle; + u64 initiator_cq_handle; + u64 pd_handle; +}; + +struct nd_create_qp { + struct nd_create_qp_hdr hdr; + u32 receive_queue_depth; + u32 max_receive_request_sge; +}; + +struct nd_create_qp_with_srq { + struct nd_create_qp_hdr header; + u64 srq_handle; +}; + +struct nd_srq_modify { + u32 version; + u32 queue_depth; + u32 ce_mapping_cnt; + u32 cb_mapping_offset; + u32 notify_threshold; + u32 reserved; + u64 srq_handle; +}; + +struct nd_cq_modify { + u32 version; + u32 queue_depth; + u32 ce_mapping_count; + u32 cb_mappings_offset; + u64 cq_handle; +}; + +struct nd_cq_notify { + u32 version; + u32 type; + u64 cq_handle; +}; + +struct nd_mr_register_hdr { + u32 version; + u32 flags; + u64 cb_length; + u64 target_addr; + u64 mr_handle; +}; + +struct nd_mr_register { + struct nd_mr_register_hdr header; + u64 address; +}; + +struct nd_bind { + u32 version; + u32 reserved; + u64 handle; + union nd_sockaddr_inet address; +}; + +struct nd_read_limits { + u32 inbound; + u32 outbound; +}; + +struct nd_connect { + u32 version; + u32 reserved; + struct nd_read_limits read_limits; + u32 cb_private_data_length; 
+ u32 cb_private_data_offset; + u64 connector_handle; + u64 qp_handle; + union nd_sockaddr_inet destination_address; + struct if_physical_addr phys_addr; +}; + +struct nd_accept { + u32 version; + u32 reserved; + struct nd_read_limits read_limits; + u32 cb_private_data_length; + u32 cb_private_data_offset; + u64 connector_handle; + u64 qp_handle; +}; + +struct nd_reject { + u32 version; + u32 reserved; + u32 cb_private_data_length; + u32 cb_private_data_offset; + u64 connector_handle; +}; + +struct nd_listen { + u32 version; + u32 back_log; + u64 listener_handle; +}; + +struct nd_get_connection_request { + u32 version; + u32 reserved; + u64 listener_handle; + u64 connector_handle; +}; + +enum ndv_mmio_type { + ND_PARTITION_KERNEL_VIRTUAL, + ND_PARTITION_SYSTEM_PHYSICAL, + ND_PARTITION_GUEST_PHYSICAL, + ND_MAXIMUM_MMIO_TYPE +}; + +struct ndv_resolve_adapter_id { + u32 version; + struct if_physical_addr phys_addr; +}; + +struct ndv_partition_create { + u32 version; + enum ndv_mmio_type mmio_type; + u64 adapter_id; + u64 xmit_cap; +}; + +struct ndv_partition_bind_luid { + u32 version; + u32 reserved; + u64 partition_handle; + struct if_physical_addr phys_addr; + //IF_LUID luid; //KYS? +}; + +struct ndv_partition_bind_address { + u32 version; + u32 reserved; + u64 partition_handle; + union nd_sockaddr_inet address; + struct if_physical_addr guest_phys_addr; + struct if_physical_addr phys_addr; +}; + +struct ndk_mr_register { + struct nd_mr_register_hdr hdr; + u32 cb_logical_page_addresses_offset; +}; + +struct ndk_bind { + struct nd_bind hdr; + u64 authentication_id; + bool is_admin; +}; + +#define FDN 0x12 +#define METHOD_BUFFERED 0x0 +#define FAA 0x0 + +#define CTL_CODE( DeviceType, Function, Method, Access ) ( \ + ((DeviceType) << 16) | ((Access) << 14) | ((Function) << 2) | (Method) \ +) + +#define ND_FUNCTION(r_, i_) ((r_) << 6 | (i_)) +#define IOCTL_ND(r_, i_) \ + CTL_CODE( FDN, ND_FUNCTION((r_), (i_)), METHOD_BUFFERED, FAA ) + +#define ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) ((ctrlCode_ >> 2) & 0xFFF) +#define ND_RESOURCE_FROM_CTL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) >> 6) +#define ND_OPERATION_FROM_CTRL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) & 0x3F) + +#define ND_DOS_DEVICE_NAME L"\\DosDevices\\Global\\NetworkDirect" +#define ND_WIN32_DEVICE_NAME L"\\\\.\\NetworkDirect" + +enum nd_resource_type { + ND_PROVIDER = 0, + ND_ADAPTER, + ND_PD, + ND_CQ, + ND_MR, + ND_MW, + ND_SRQ, + ND_CONNECTOR, + ND_LISTENER, + ND_QP, + ND_VIRTUAL_PARTITION, + ND_RESOURCE_TYPE_COUNT +}; + +#define ND_OPERATION_COUNT 14 + +#define IOCTL_ND_PROVIDER(i_) IOCTL_ND(ND_PROVIDER, i_) +#define IOCTL_ND_ADAPTER(i_) IOCTL_ND(ND_ADAPTER, i_) +#define IOCTL_ND_PD(i_) IOCTL_ND(ND_PD, i_) +#define IOCTL_ND_CQ(i_) IOCTL_ND(ND_CQ, i_) +#define IOCTL_ND_MR(i_) IOCTL_ND(ND_MR, i_) +#define IOCTL_ND_MW(i_) IOCTL_ND(ND_MW, i_) +#define IOCTL_ND_SRQ(i_) IOCTL_ND(ND_SRQ, i_) +#define IOCTL_ND_CONNECTOR(i_) IOCTL_ND(ND_CONNECTOR, i_) +#define IOCTL_ND_LISTENER(i_) IOCTL_ND(ND_LISTENER, i_) +#define IOCTL_ND_QP(i_) IOCTL_ND(ND_QP, i_) +#define IOCTL_ND_VIRTUAL_PARTITION(i_) IOCTL_ND(ND_VIRTUAL_PARTITION, i_) + +/* Provider IOCTLs */ +#define IOCTL_ND_PROVIDER_INIT IOCTL_ND_PROVIDER( 0 ) +#define IOCTL_ND_PROVIDER_BIND_FILE IOCTL_ND_PROVIDER( 1 ) +#define IOCTL_ND_PROVIDER_QUERY_ADDRESS_LIST IOCTL_ND_PROVIDER( 2 ) +#define IOCTL_ND_PROVIDER_RESOLVE_ADDRESS IOCTL_ND_PROVIDER( 3 ) +#define IOCTL_ND_PROVIDER_MAX_OPERATION 4 + +/* Adapter IOCTLs */ +#define IOCTL_ND_ADAPTER_OPEN IOCTL_ND_ADAPTER( 0 ) 
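+
+/* (Editor's note: a worked example of the encoding, using only the
+ * macros above. ND_ADAPTER is 1 in enum nd_resource_type, so:
+ *
+ *     IOCTL_ND_ADAPTER_OPEN
+ *         = CTL_CODE(FDN, ND_FUNCTION(1, 0), METHOD_BUFFERED, FAA)
+ *         = (0x12 << 16) | (0 << 14) | ((1 << 6 | 0) << 2) | 0
+ *         = 0x120100
+ *
+ * The decode macros invert this: ND_FUNCTION_FROM_CTL_CODE(0x120100)
+ * is 0x40, resource 0x40 >> 6 = 1 (ND_ADAPTER), operation
+ * 0x40 & 0x3F = 0 (open).)
+ */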
+#define IOCTL_ND_ADAPTER_CLOSE IOCTL_ND_ADAPTER( 1 ) +#define IOCTL_ND_ADAPTER_QUERY IOCTL_ND_ADAPTER( 2 ) +#define IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST IOCTL_ND_ADAPTER( 3 ) +#define IOCTL_ND_ADAPTER_MAX_OPERATION 4 + +/* Protection Domain IOCTLs */ +#define IOCTL_ND_PD_CREATE IOCTL_ND_PD( 0 ) +#define IOCTL_ND_PD_FREE IOCTL_ND_PD( 1 ) +#define IOCTL_ND_PD_MAX_OPERATION 2 + +/* Completion Queue IOCTLs */ +#define IOCTL_ND_CQ_CREATE IOCTL_ND_CQ( 0 ) +#define IOCTL_ND_CQ_FREE IOCTL_ND_CQ( 1 ) +#define IOCTL_ND_CQ_CANCEL_IO IOCTL_ND_CQ( 2 ) +#define IOCTL_ND_CQ_GET_AFFINITY IOCTL_ND_CQ( 3 ) +#define IOCTL_ND_CQ_MODIFY IOCTL_ND_CQ( 4 ) +#define IOCTL_ND_CQ_NOTIFY IOCTL_ND_CQ( 5 ) +#define IOCTL_ND_CQ_MAX_OPERATION 6 + +/* Memory Region IOCTLs */ +#define IOCTL_ND_MR_CREATE IOCTL_ND_MR( 0 ) +#define IOCTL_ND_MR_FREE IOCTL_ND_MR( 1 ) +#define IOCTL_ND_MR_CANCEL_IO IOCTL_ND_MR( 2 ) +#define IOCTL_ND_MR_REGISTER IOCTL_ND_MR( 3 ) +#define IOCTL_ND_MR_DEREGISTER IOCTL_ND_MR( 4 ) +#define IOCTL_NDK_MR_REGISTER IOCTL_ND_MR( 5 ) +#define IOCTL_ND_MR_MAX_OPERATION 6 + +/* Memory Window IOCTLs */ +#define IOCTL_ND_MW_CREATE IOCTL_ND_MW( 0 ) +#define IOCTL_ND_MW_FREE IOCTL_ND_MW( 1 ) +#define IOCTL_ND_MW_MAX_OPERATION 2 + +/* Shared Receive Queue IOCTLs */ +#define IOCTL_ND_SRQ_CREATE IOCTL_ND_SRQ( 0 ) +#define IOCTL_ND_SRQ_FREE IOCTL_ND_SRQ( 1 ) +#define IOCTL_ND_SRQ_CANCEL_IO IOCTL_ND_SRQ( 2 ) +#define IOCTL_ND_SRQ_GET_AFFINITY IOCTL_ND_SRQ( 3 ) +#define IOCTL_ND_SRQ_MODIFY IOCTL_ND_SRQ( 4 ) +#define IOCTL_ND_SRQ_NOTIFY IOCTL_ND_SRQ( 5 ) +#define IOCTL_ND_SRQ_MAX_OPERATION 6 + +/* Connector IOCTLs */ +#define IOCTL_ND_CONNECTOR_CREATE IOCTL_ND_CONNECTOR( 0 ) +#define IOCTL_ND_CONNECTOR_FREE IOCTL_ND_CONNECTOR( 1 ) +#define IOCTL_ND_CONNECTOR_CANCEL_IO IOCTL_ND_CONNECTOR( 2 ) +#define IOCTL_ND_CONNECTOR_BIND IOCTL_ND_CONNECTOR( 3 ) +#define IOCTL_ND_CONNECTOR_CONNECT IOCTL_ND_CONNECTOR( 4 ) +#define IOCTL_ND_CONNECTOR_COMPLETE_CONNECT IOCTL_ND_CONNECTOR( 5 ) +#define IOCTL_ND_CONNECTOR_ACCEPT IOCTL_ND_CONNECTOR( 6 ) +#define IOCTL_ND_CONNECTOR_REJECT IOCTL_ND_CONNECTOR( 7 ) +#define IOCTL_ND_CONNECTOR_GET_READ_LIMITS IOCTL_ND_CONNECTOR( 8 ) +#define IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA IOCTL_ND_CONNECTOR( 9 ) +#define IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS IOCTL_ND_CONNECTOR( 10 ) +#define IOCTL_ND_CONNECTOR_GET_ADDRESS IOCTL_ND_CONNECTOR( 11 ) +#define IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT IOCTL_ND_CONNECTOR( 12 ) +#define IOCTL_ND_CONNECTOR_DISCONNECT IOCTL_ND_CONNECTOR( 13 ) +#define IOCTL_ND_CONNECTOR_MAX_OPERATION 14 + +/* Listener IOCTLs */ +#define IOCTL_ND_LISTENER_CREATE IOCTL_ND_LISTENER( 0 ) +#define IOCTL_ND_LISTENER_FREE IOCTL_ND_LISTENER( 1 ) +#define IOCTL_ND_LISTENER_CANCEL_IO IOCTL_ND_LISTENER( 2 ) +#define IOCTL_ND_LISTENER_BIND IOCTL_ND_LISTENER( 3 ) +#define IOCTL_ND_LISTENER_LISTEN IOCTL_ND_LISTENER( 4 ) +#define IOCTL_ND_LISTENER_GET_ADDRESS IOCTL_ND_LISTENER( 5 ) +#define IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST IOCTL_ND_LISTENER( 6 ) +#define IOCTL_ND_LISTENER_MAX_OPERATION 7 + +/* Queue Pair IOCTLs */ +#define IOCTL_ND_QP_CREATE IOCTL_ND_QP( 0 ) +#define IOCTL_ND_QP_CREATE_WITH_SRQ IOCTL_ND_QP( 1 ) +#define IOCTL_ND_QP_FREE IOCTL_ND_QP( 2 ) +#define IOCTL_ND_QP_FLUSH IOCTL_ND_QP( 3 ) +#define IOCTL_ND_QP_MAX_OPERATION 4 + +/* Kernel-mode only IOCTLs (IRP_MJ_INTERNAL_DEVICE_CONTROL) */ +#define IOCTL_NDV_PARTITION_RESOLVE_ADAPTER_ID IOCTL_ND_VIRTUAL_PARTITION( 0 ) +#define IOCTL_NDV_PARTITION_CREATE IOCTL_ND_VIRTUAL_PARTITION( 1 ) +#define IOCTL_NDV_PARTITION_FREE 
IOCTL_ND_VIRTUAL_PARTITION( 2 ) +#define IOCTL_NDV_PARTITION_BIND IOCTL_ND_VIRTUAL_PARTITION( 3 ) +#define IOCTL_NDV_PARTITION_UNBIND IOCTL_ND_VIRTUAL_PARTITION( 4 ) +#define IOCTL_NDV_PARTITION_BIND_LUID IOCTL_ND_VIRTUAL_PARTITION( 5 ) +#define IOCTL_NDV_PARTITION_MAX_OPERATION 6 + + +#define MB_SHIFT 20 + + +/* Ringbuffer size for the channel */ +#define NDV_NUM_PAGES_IN_RING_BUFFER 64 + +#define NDV_MAX_PACKETS_PER_RECEIVE 8 + +#define NDV_MAX_PACKET_COUNT 16304 + +#define NDV_MAX_NUM_OUTSTANDING_RECEIVED_PACKETS (16304) +#define NDV_MAX_HANDLE_TABLE_SIZE (16304) +#define NDV_HOST_MAX_HANDLE_TABLE_SIZE (NDV_MAX_HANDLE_TABLE_SIZE * 16) + + +#define NDV_MAX_MAPPINGS 4 + +#define NDV_STATE_NONE 0x00000000 +#define NDV_STATE_CREATED 0x00000001 +#define NDV_STATE_CONNECTING 0x00000002 +#define NDV_STATE_INITIALIZING 0x00000003 +#define NDV_STATE_OPERATIONAL 0xEFFFFFFF +#define NDV_STATE_FAILED 0xFFFFFFFF + + +#define NDV_MAX_PRIVATE_DATA_SIZE 64 +#define NDV_MAX_IOCTL_SIZE 256 + +/* max size of buffer for vector of ND_MAPPING */ +#define NDV_MAX_MAPPING_BUFFER_SIZE \ + (NDV_MAX_MAPPINGS * sizeof(union nd_mapping)) + +/* max expected ioctl buffer size from users */ +#define NDV_MAX_IOCTL_BUFFER_SIZE \ + (NDV_MAX_IOCTL_SIZE + \ + NDV_MAX_MAPPING_BUFFER_SIZE + \ + NDV_MAX_PRIVATE_DATA_SIZE) + +/* max PFN array for inline buffers */ +#define NDV_MAX_INLINE_PFN_ARRAY_LENGTH 32 + +/* Field header size for inline buffer */ +#define NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE \ + (NDV_MAX_MAPPINGS * sizeof(NDV_PACKET_FIELD)) + +/* Max for a single field */ + +#define NDV_MAX_SINGLE_MAPPING_FIELD ( sizeof(GPA_RANGE) + \ + (sizeof(PFN_NUMBER) * NDV_MAX_INLINE_PFN_ARRAY_LENGTH)) + +/* Max for all inline data */ + +#define NDV_MAX_MAPPING_DATA_SIZE (NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE + \ + (NDV_MAX_MAPPINGS * NDV_MAX_SINGLE_MAPPING_FIELD)) + + +#define NDV_MAX_PACKET_HEADER_SIZE 256 + +#define NDV_MAX_PACKET_SIZE (NDV_MAX_PACKET_HEADER_SIZE + \ + NDV_MAX_IOCTL_BUFFER_SIZE + \ + NDV_MAX_MAPPING_DATA_SIZE) + +/* Well-known message type INIT is defined for the channel, + * not for the protocol. + */ + +#define NDV_PACKET_TYPE_INIT 0xFFFFFFFF + +/* Invalid protocol version used to identify uninitialized channels */ + +#define NDV_PROTOCOL_VERSION_INVALID 0xFFFFFFFF + +/* Flags that control the behavior of packet handling */ + +enum ndv_packet_options { + NDV_PACKET_OPTION_NONE = 0x00, + + /* Indicates that the ExternalDataMdl parameter is expected to be + * passed and must be handled in the receiver. This call must be + * handled specially to ensure that the MDL can be created correctly. + */ + NDV_PACKET_OPTION_EXTERNAL_DATA = 0x01, + + /* Indicates that the receiver must execute the handler at passive. */ + NDV_PACKET_OPTIONS_REQUIRES_PASSIVE = 0x02, + + /* Indicates that the sender does not expect and is not waiting for a + * response packet. + */ + NDV_PACKET_OPTIONS_POST = 0x04, +}; + +#define NDV_PACKET_TYPE(id_, opt_) \ + (((opt_)<<24) | (id_)) + +#define NDV_PACKET_TYPE_OPTIONS(type_) \ + (((type_) >> 24) & 0xFF) + +#define NDV_PACKET_TYPE_ID(type_) \ + ((type_) & 0xFFFFFF) + +#define NDV_ADD_PACKET_OPTION(type_, opt_) \ + (type_) |= (opt_<<24) + +/* The header value sent on all packets */ +union ndv_packet_hdr { + + struct { + /* The type of packet. + * This value should be created with the NDV_PACKET_TYPE macro + * to include all packet options within the packet type. + */ + u32 packet_type; + /* The size of the entire fixed message structure that exists
before the data. This must be >= sizeof(NDV_PACKET_HEADER) + */ + u32 hdr_sz; + /* The size of the data that follows the message. + * data_sz + hdr_sz gives the total size of + * the buffer that is used. + */ + u32 data_sz; + /* The status code used to indicate success or failure. + * It is only used in completions and during responses. + */ + u32 status; //KYS: NTSTATUS? + }; + + u64 padding[2]; //KYS: why? +}; + + +/* The core INIT packet. This message is defined in the channel + * not in the protocol. This message should never change size + * or behavior, as it could impact compatibility in the future. + * This packet is used to negotiate the protocol version, so changing + * this size could break backward compat. + */ + +union ndv_packet_init { + struct { + u32 packet_type; + u32 protocol_version; + u32 flags; + }; + u64 padding[2]; +} __packed; + +#define NDV_PACKET_INIT_SIZE 16 + +/* Data packing flags used for accessing the dynamic fields inside a packet */ +#define NDV_DATA_PACKING_2 0x1 +#define NDV_DATA_PACKING_4 0x3 +#define NDV_DATA_PACKING_8 0x7 + + +#define NDV_PROTOCOL_VERSION_1 0x0100 +#define NDV_PROTOCOL_VERSION_CURRENT NDV_PROTOCOL_VERSION_1 +#define NDV_PROTOCOL_VERSION_COUNT 1 + +struct ndv_pkt_field { + u32 size; + u32 offset; +}; + +enum ndv_pkt_id { + NDV_PKT_UNKNOWN = 0, + /* Version 1 Message IDs */ + NDV_PKT_ID1_BIND, + NDV_PKT_ID1_CREATE, + NDV_PKT_ID1_CLEANUP, + NDV_PKT_ID1_CANCEL, + NDV_PKT_ID1_CONTROL, + NDV_PKT_ID1_COMPLETE, + NDV_PKT_ID1_INIT_RESOURCES, +}; + +/* The guest will send this as the first message just after init. + * The resources are reserved per channel. + */ + +struct ndv_pkt_hdr_init_resources_1 { + + union ndv_packet_hdr pkt_hdr; + u16 io_space_sz_mb; + u64 io_space_start; + +}; + + + +/* The guest will send this packet to the host after channel init + * to query support for the adapters that are registered. + */ + +struct ndv_pkt_hdr_bind_1 { + union ndv_packet_hdr pkt_hdr; + bool unbind; + union nd_sockaddr_inet ip_address; + struct if_physical_addr phys_addr; + u64 guest_id; +}; + +union ndv_context_handle { + u64 val64; + struct { + u32 local; + u32 remote; + }; +}; + +struct ndv_pkt_hdr_create_1 { + union ndv_packet_hdr pkt_hdr; + + /* Identifies the object used to track this file handle on both + * the guest and the host. When sent from the guest, it will contain + * the guest handle. On success, the host will populate and return + * its handle value as well. + */ + + union ndv_context_handle handle; + + /* The parameters sent to the CreateFile call */ + u32 access_mask; + u32 open_options; + + u16 file_attributes; //KYS: This field must be 64 bit aligned + + u16 share_access; //KYS + + u32 kys_padding; //KYS + + u16 ea_length; //KYS; needs to be 64 bit aligned; what is ea length - unused +}; + + +struct ndv_pkt_hdr_cleanup_1 { + union ndv_packet_hdr pkt_hdr; + + /* Identifies the object used to track this file handle on both + * the guest and the host. When sent from the guest, it will contain + * both the guest and host handle values. The host will use this + * value to cleanup its resource, then update its portion of the handle + * to NDV_HANDLE_NULL before returning the data back to the guest. + */ + union ndv_context_handle handle; +}; + +struct ndv_pkt_hdr_cancel_1 { + union ndv_packet_hdr pkt_hdr; + union ndv_context_handle file_handle; + union ndv_context_handle irp_handle; +}; + +struct ndv_bind_port_info { + //LUID authentication_id; //KYS: LUID?
+ bool is_admin; +}; + +struct ndv_extended_data_flds { + union { + u32 field_count; + u64 padding; + }; + + //struct ndv_pkt_field fields[ANYSIZE_ARRAY]; //KYS? +}; + + +struct ndv_packet_hdr_control_1 { + union ndv_packet_hdr pkt_hdr; + /* Identifies the object used to track this file handle on both + * the guest and the host. This should always have both guest + * and host handle values inside it. + */ + + union ndv_context_handle file_handle; + + /* The handle information for the allocated irp context object. + * This information is used when the host/guest starts the cancelation + */ + union ndv_context_handle irp_handle; + + /* The input data describing in the IO control parameters */ + + u32 io_cntrl_code; + u32 output_buf_sz; + u32 input_buf_sz; + u32 input_output_buf_offset; + + /* These are used in the return message to indicate the status of the IO + * operation and the amount of data written to the output buffer. + */ + u32 io_status; //KYS: NTSTATUS? + u32 bytes_returned; + + /* This contains the field information for additional data that is sent + * with the packet that is IOCTL specific. + */ + + struct ndv_pkt_field extended_data; +}; + +/* + * Include MLX specific defines. + */ + +#include "mx_abi.h" + +/* Driver specific state. + */ + +/* + * We need to have host open a file; some + * Windows constants for open. + */ +#define STANDARD_RIGHTS_ALL (0x001F0000L) +#define FILE_ATTRIBUTE_NORMAL (0x80) +#define FILE_SHARE_READ (0x00000001) +#define FILE_SHARE_WRITE (0x00000002) +#define FILE_SHARE_DELETE (0x00000004) +#define FILE_FLAG_OVERLAPPED (0x40000000) +#define FILE_SHARE_ALL (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE) +#define CREATE_ALWAYS (2) +#define OPEN_EXISTING (3) + +#define RTL_NUMBER_OF(_x) \ + sizeof(_x)/sizeof(_x[0]) +/* + * The context structure tracks the open state. + */ + +/* + * Packet layout for open adaptor. + */ + +/* + * Packet for querying the address list. + */ + +union query_addr_list_ioctl { + struct nd_handle in; + union nd_sockaddr_inet out[16]; //KYS a max of 16 addresses +}; + +struct pkt_query_addr_list { + struct ndv_packet_hdr_control_1 hdr; + union query_addr_list_ioctl ioctl; + unsigned long activity_id; +}; + + +struct pkt_fld { + u32 size; + u32 offset; +}; + +struct fld_data { + union { + u64 padding; + }; +}; + +struct extended_data_oad { + union { + u32 cnt; + u64 padding; + }; + /* offsets are from start of extended data struct + * and should start on 8 byte boundary + */ + struct pkt_fld fields[IBV_GET_CONTEXT_MAPPING_MAX]; +}; + +union oad_ioctl { + struct nd_open_adapter input; + struct nd_resource_descriptor resrc_desc; +}; + +union oad_mappings { + struct ibv_get_context_req ctx_input; + struct ibv_get_context_resp ctx_output; +}; + +struct pkt_nd_open_adapter { + struct ndv_packet_hdr_control_1 hdr; + + union oad_ioctl ioctl; + union oad_mappings mappings; + + /* + * Extended data. + */ + struct extended_data_oad ext_data; +}; + +/* + * Create CQ IOCTL. 
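+ *
+ * (Editor's note: the create-CQ packet below shows the four-part layout
+ * shared by all the create packets in this header:
+ *
+ *     struct ndv_packet_hdr_control_1 hdr;   // control header
+ *     union create_cq_ioctl ioctl;           // ND request in, resource out
+ *     union create_cq_mappings mappings;     // mlx4 uverbs req/resp
+ *     struct create_cq_ext_data ext_data;    // {size, offset} fields + GPAs
+ *
+ * The pfn_array[2] in cq_db_gpa and cq_sn_gpa presumably allows the
+ * 4-byte doorbell and arm_sn words to straddle a page boundary.)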
+ */ + +struct cq_db_gpa { + u32 byte_count; + u32 byte_offset; + u64 pfn_array[2]; +}; + +struct cq_sn_gpa { + u32 byte_count; + u32 byte_offset; + u64 pfn_array[2]; +}; + +struct create_cq_ext_data { + union { + u32 cnt; + u64 padding; + }; + /* offsets are from start of extended data struct + * and should start on 8 byte boundary + */ + struct pkt_fld fields[MLX4_IB_CREATE_CQ_MAPPING_MAX]; + struct cq_db_gpa db_gpa; + struct cq_sn_gpa sn_gpa; + struct gpa_range cqbuf_gpa; +}; + +union create_cq_ioctl { + struct nd_create_cq input; + struct nd_resource_descriptor resrc_desc; +}; + +union create_cq_mappings { + struct ibv_create_cq cq_in; + struct ibv_create_cq_resp cq_resp; +}; + +struct pkt_nd_create_cq { + struct ndv_packet_hdr_control_1 hdr; + + union create_cq_ioctl ioctl; + union create_cq_mappings mappings; + + /* + * Extended data. + */ + struct create_cq_ext_data ext_data; +}; + +/* + * IOCTL to free CQ. + */ +struct free_cq_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_free_cq { + struct ndv_packet_hdr_control_1 hdr; + + struct free_cq_ioctl ioctl; +}; + + +/* + * IOCTL to QUERY CQ - CQ NOTIFY + */ + +struct notify_cq_ioctl { + struct nd_cq_notify in; +}; + +struct pkt_nd_notify_cq { + struct ndv_packet_hdr_control_1 hdr; + struct notify_cq_ioctl ioctl; +}; + +/* + * IOCTL to create a listener + */ + +struct nd_ep_create { + struct nd_handle hdr; + bool to_semantics; + unsigned long activity_id; +}; + +union listener_cr_ioctl { + struct nd_ep_create in; + u64 out; +}; + +struct pkt_nd_cr_listener { + struct ndv_packet_hdr_control_1 hdr; + union listener_cr_ioctl ioctl; +}; + +/* + * IOCTL to free listener. + */ + +struct listener_free_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_free_listener { + struct ndv_packet_hdr_control_1 hdr; + struct listener_free_ioctl ioctl; +}; + +/* + * IOCTL for listener cancel IO. + */ +struct listener_cancelio_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_cancelio_listener { + struct ndv_packet_hdr_control_1 hdr; + struct listener_cancelio_ioctl ioctl; +}; + +/* + * IOCTL for LISTENER BIND + */ + +union listener_bind_ioctl { + struct ndk_bind in; +}; + +struct pkt_nd_bind_listener { + struct ndv_packet_hdr_control_1 hdr; + union listener_bind_ioctl ioctl; +}; + +/* + * After the listener is bound, enable + * listening. + */ + +union listener_listen_ioctl { + struct nd_listen in; +}; + +struct pkt_nd_listen_listener { + struct ndv_packet_hdr_control_1 hdr; + union listener_listen_ioctl ioctl; +}; + +/* + * IOCTL for getting the address from a listener. + * + */ + +union listener_get_addr_ioctl { + struct nd_handle in; + union nd_sockaddr_inet out; +}; + +struct pkt_nd_get_addr_listener { + struct ndv_packet_hdr_control_1 hdr; + union listener_get_addr_ioctl ioctl; +}; + +/* + * IOCTL to get a connection from a listener. + */ + +union listener_get_connection_ioctl { + struct nd_get_connection_request in; + union nd_sockaddr_inet out; +}; + +struct pkt_nd_get_connection_listener { + struct ndv_packet_hdr_control_1 hdr; + union listener_get_connection_ioctl ioctl; +}; + + +/* + * Connector IOCTLs + */ + +/* + * IOCTL to create connector. + */ + +union connector_cr_ioctl { //KYS should this be a union or struct? + struct nd_ep_create in; + u64 out; +}; + +struct pkt_nd_cr_connector { + struct ndv_packet_hdr_control_1 hdr; + union connector_cr_ioctl ioctl; //KYS: union or struct +}; + +/* + * IOCTL to free connector.
+ */ + +struct connector_free_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_free_connector { + struct ndv_packet_hdr_control_1 hdr; + struct connector_free_ioctl ioctl; +}; + +/* + * IOCTL to cancel I/O on a connector. + */ + +struct connector_cancelio_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_cancelio_connector { + struct ndv_packet_hdr_control_1 hdr; + struct connector_cancelio_ioctl ioctl; +}; + +/* + * IOCTL to Bind an address to the connector. + */ + +union connector_bind_ioctl { + struct ndk_bind in; +}; + +struct pkt_nd_bind_connector { + struct ndv_packet_hdr_control_1 hdr; + union connector_bind_ioctl ioctl; +}; + +/* + * IOCTL to connect a connector. + */ + +struct connector_connect_in { + struct nd_connect hdr; + u8 retry_cnt; + u8 rnr_retry_cnt; + u8 priv_data[56]; + unsigned long activity_id; +}; + +union connector_connect_ioctl { + struct connector_connect_in in; +}; + +struct pkt_nd_connector_connect { + struct ndv_packet_hdr_control_1 hdr; + union connector_connect_ioctl ioctl; +}; + +/* + * IOCTL for connector complete connect + */ + +struct complete_connect_in { + struct nd_handle hdr; + u8 rnr_nak_to; + unsigned long activity_id; +}; + +struct complete_connect_out { + enum ibv_qp_state state; +}; + +union connector_complete_connect_ioctl { + struct complete_connect_in in; + struct complete_connect_out out; +}; + +struct pkt_nd_connector_connect_complete { + struct ndv_packet_hdr_control_1 hdr; + union connector_complete_connect_ioctl ioctl; +}; + + +#define MAX_PRIVATE_DATA_LEN 148 + +/* + * IOCTL for connector accept. + */ + +struct connector_accept_in { + struct nd_accept hdr; + u8 rnr_retry_cnt; + u8 rnr_nak_to; + u8 private_data[MAX_PRIVATE_DATA_LEN]; + unsigned long activity_id; +}; + +struct connector_accept_out { + enum ibv_qp_state state; +}; + +union connector_accept_ioctl { + struct connector_accept_in in; + struct connector_accept_out out; +}; + +struct pkt_nd_connector_accept { + struct ndv_packet_hdr_control_1 hdr; + union connector_accept_ioctl ioctl; +}; + +/* + * IOCTL for connector to reject a connection. + */ + +struct connector_reject_in { + struct nd_reject hdr; + u8 private_data[MAX_PRIVATE_DATA_LEN]; +}; + +struct connector_reject_out { + enum ibv_qp_state state; +}; + +union connector_reject_ioctl { + struct connector_reject_in in; + struct connector_reject_out out; +}; + +struct pkt_nd_connector_reject { + struct ndv_packet_hdr_control_1 hdr; + union connector_reject_ioctl ioctl; +}; + +/* + * IOCTL to get connector read limits. + */ + +struct connector_get_rd_limits_in { + struct nd_handle in; +}; + +struct connector_get_rd_limits_out { + struct nd_read_limits out; +}; + +union connector_get_rd_limits_ioctl { + struct connector_get_rd_limits_in in; + struct connector_get_rd_limits_out out; +}; + +struct pkt_nd_connector_get_rd_limits { + struct ndv_packet_hdr_control_1 hdr; + union connector_get_rd_limits_ioctl ioctl; +}; + +/* + * IOCTL to get connector private data. + */ +union connector_get_priv_data_ioctl { + struct nd_handle in; + u8 out[MAX_PRIVATE_DATA_LEN]; +}; + +struct pkt_nd_connector_get_priv_data { + struct ndv_packet_hdr_control_1 hdr; + union connector_get_priv_data_ioctl ioctl; +}; + + +/* + * IOCTL get peer address. + */ + +union connector_get_peer_addr_ioctl { + struct nd_handle in; + union nd_sockaddr_inet out; +}; + +struct pkt_nd_connector_get_peer_addr { + struct ndv_packet_hdr_control_1 hdr; + union connector_get_peer_addr_ioctl ioctl; +}; + +/* + * IOCTL to get connector address. 
+
+/*
+ * IOCTL to get connector address.
+ */
+
+union connector_get_addr_ioctl {
+	struct nd_handle in;
+	union nd_sockaddr_inet out;
+};
+
+struct pkt_nd_connector_get_addr {
+	struct ndv_packet_hdr_control_1 hdr;
+	union connector_get_addr_ioctl ioctl;
+};
+
+/*
+ * IOCTL for disconnect notification.
+ */
+
+union connector_notify_disconnect_ioctl {
+	struct nd_handle in;
+};
+
+struct pkt_nd_connector_notify_disconnect {
+	struct ndv_packet_hdr_control_1 hdr;
+	union connector_notify_disconnect_ioctl ioctl;
+};
+
+union connector_disconnect_ioctl {
+	struct nd_handle in;
+};
+
+struct pkt_nd_connector_disconnect {
+	struct ndv_packet_hdr_control_1 hdr;
+	union connector_notify_disconnect_ioctl ioctl;
+};
+
+/*
+ * IOCTLs for QP operations.
+ */
+
+/*
+ * Create qp IOCTL.
+ */
+
+struct qp_db_gpa {
+	u32 byte_count;
+	u32 byte_offset;
+	u64 pfn_array[1];
+};
+
+struct create_qp_ext_data {
+	union {
+		u32 cnt;
+		u64 padding;
+	};
+	/* offsets are from start of extended data struct
+	 * and should start on 8 byte boundary
+	 */
+	struct pkt_fld fields[MLX4_IB_CREATE_QP_MAPPINGS_MAX];
+	struct qp_db_gpa db_gpa;
+	struct gpa_range qpbuf_gpa;
+};
+
+union create_qp_ioctl {
+	struct nd_create_qp input;
+	struct nd_resource_descriptor resrc_desc;
+};
+
+union create_qp_mappings {
+	struct ibv_create_qp qp_in;
+	struct ibv_create_qp_resp qp_resp;
+};
+
+struct pkt_nd_create_qp {
+	struct ndv_packet_hdr_control_1 hdr;
+
+	union create_qp_ioctl ioctl;
+	union create_qp_mappings mappings;
+
+	/*
+	 * Extended data.
+	 */
+	struct create_qp_ext_data ext_data;
+};
+
+/*
+ * IOCTL to flush a QP.
+ */
+struct flush_qp_ioctl {
+	struct nd_handle in;
+	enum ibv_qp_state out;
+};
+
+struct pkt_nd_flush_qp {
+	struct ndv_packet_hdr_control_1 hdr;
+	struct flush_qp_ioctl ioctl;
+};
+
+/*
+ * Memory Region IOCTLS
+ */
+union create_mr_ioctl {
+	struct nd_handle in;
+	u64 out;
+};
+
+struct pkt_nd_create_mr {
+	struct ndv_packet_hdr_control_1 hdr;
+	union create_mr_ioctl ioctl;
+};
+
+struct mr_out {
+	u32 lkey;
+	u32 rkey;
+	unsigned long activity_id;
+};
+
+
+union register_mr_ioctl {
+	struct nd_mr_register in;
+	struct mr_out out;
+};
+
+struct pkt_nd_register_mr {
+	struct ndv_packet_hdr_control_1 hdr;
+	union register_mr_ioctl ioctl;
+};
+
+struct deregister_mr_ioctl {
+	struct nd_handle in;
+};
+
+struct pkt_nd_deregister_mr {
+	struct ndv_packet_hdr_control_1 hdr;
+	struct deregister_mr_ioctl ioctl;
+};
+
+/*
+ * IOCTL to disconnect connector
+ */
+
+/*
+ * Create PD IOCTL.
+ */
+struct nd_create_pd_ioctl {
+	union {
+		struct nd_handle in;
+		u64 out_handle;
+	};
+	struct ibv_alloc_pd_resp resp;
+};
+
+struct pkt_nd_pd_create {
+	struct ndv_packet_hdr_control_1 hdr;
+	struct nd_create_pd_ioctl ioctl;
+};
+
+/*
+ * Free Handle. Check the layout with Luke.
+ *
+ */
+struct free_handle_ioctl {
+	struct nd_handle in;
+};
+
+struct pkt_nd_free_handle {
+	struct ndv_packet_hdr_control_1 hdr;
+	struct free_handle_ioctl ioctl;
+};
+
+/*
+ * Cancel I/O.
+ */
+
+struct cancel_io_ioctl {
+	struct nd_handle in;
+};
+
+struct pkt_nd_cancel_io {
+	struct ndv_packet_hdr_control_1 hdr;
+	struct cancel_io_ioctl ioctl;
+};
+
+/*
+ * Connector states:
+ */
+
+enum connector_state {
+	HVND_CON_INCOMING,
+	HVND_CON_INCOMING_ESTABLISHED,
+	HVND_CON_INCOMING_REJECTED,
+	HVND_CON_OUTGOING_REQUEST
+};
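
Note that flush_qp_ioctl above is a struct rather than a union, so the
nd_handle input and the returned ibv_qp_state sit side by side instead
of overlaying. A hedged sketch of that round trip (the IOCTL code name
and the nd_handle layout are assumptions):

	struct pkt_nd_flush_qp pkt;
	int ret;

	memset(&pkt, 0, sizeof(pkt));
	hvnd_init_hdr(&pkt.hdr,
		      sizeof(pkt) - sizeof(struct ndv_packet_hdr_control_1),
		      uctx->create_pkt.handle.local,
		      uctx->create_pkt.handle.remote,
		      IOCTL_ND_QP_FLUSH, /* assumed IOCTL code */
		      0, 0, 0);
	pkt.ioctl.in.version = ND_VERSION_1; /* assumed nd_handle layout */
	pkt.ioctl.in.handle = qp->qp_handle;

	ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt);
	if (!ret)
		qp->qp_state = (enum ib_qp_state)pkt.ioctl.out;
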
+
+
+/*
+ * Adaptor query IOCTL.
+ */
+struct nd_adap_query_ioctl {
+	union {
+		struct nd_adapter_query ad_q;
+		struct adapter_info_v2 ad_info;
+	};
+};
+
+struct pkt_nd_query_adaptor {
+	struct ndv_packet_hdr_control_1 hdr;
+	struct nd_adap_query_ioctl ioctl;
+};
+
+struct nd_ioctl {
+	union {
+		struct nd_handle handle;
+		u8 raw_buffer[NDV_MAX_IOCTL_BUFFER_SIZE];
+	};
+};
+
+struct pkt_nd_provider_ioctl {
+	struct ndv_packet_hdr_control_1 hdr;
+	struct nd_ioctl ioctl;
+};
+
+struct hvnd_ib_pd {
+	struct ib_pd ibpd;
+	u32 pdn;
+	u64 handle;
+};
+
+struct hvnd_work {
+	struct work_struct work;
+	void *callback_arg;
+};
+
+struct hvnd_disconnect_work {
+	struct work_struct work;
+	int status;
+	void *callback_arg;
+};
+
+/*
+struct hvnd_delayed_work {
+	struct delayed_work work;
+	void *callback_arg;
+};
+*/
+
+enum hvnd_cm_state {
+	hvnd_cm_idle = 0,
+	hvnd_cm_connect_reply_sent, //active
+	hvnd_cm_connect_reply_refused,
+	hvnd_cm_connect_received, //active
+	hvnd_cm_connect_request_sent, //passive
+	hvnd_cm_accept_sent,
+	hvnd_cm_close_sent,
+	hvnd_cm_established_sent,
+};
+
+struct incoming_pkt {
+	struct list_head list_entry;
+	char pkt[0];
+};
+
+struct hvnd_ep_obj {
+/*
+	spinlock_t ep_lk;
+	bool to_be_destroyed;
+	bool io_outstanding;
+
+	wait_queue_head_t wait;
+	bool stopped;
+	atomic_t process_refcnt; // how many NDV_PKT_ID1_COMPLETE packets we are currently processing
+*/
+	bool stopping;
+	wait_queue_head_t wait_pending;
+	atomic_t nr_requests_pending;
+
+	enum nd_resource_type type;
+	enum connector_state state; //KYS need to look at locking
+	struct iw_cm_id *cm_id;
+	enum hvnd_cm_state cm_state;
+	struct completion block_event;
+	struct completion disconnect_event;
+	struct completion connector_accept_event;
+	int connector_accept_status;
+	u64 ep_handle;
+	spinlock_t incoming_pkt_list_lock;
+	struct list_head incoming_pkt_list;
+	struct hvnd_ep_obj *parent;
+	struct hvnd_dev *nd_dev;
+	struct hvnd_ucontext *uctx;
+	struct hvnd_work wrk;
+	struct hvnd_cq *cq;
+	u8 ord;
+	u8 ird;
+	char priv_data[MAX_PRIVATE_DATA_LEN];
+	bool incoming;
+	atomic_t disconnect_notified;
+	u64 outstanding_handle;
+	u32 local_irp;
+	struct hvnd_ep_obj *outstanding_ep;
+	struct pkt_nd_connector_connect connector_connect_pkt;
+	int connector_connect_retry;
+};
+
+struct hvnd_ucontext {
+	struct ib_ucontext ibucontext;
+	struct list_head listentry;
+	struct ndv_pkt_hdr_create_1 create_pkt;
+	struct ndv_pkt_hdr_create_1 create_pkt_ovl; /* Overlap handle */
+	struct pkt_nd_provider_ioctl pr_init_pkt;
+	union ndv_context_handle file_handle;
+	union ndv_context_handle file_handle_ovl;
+
+	struct pkt_nd_open_adapter o_adap_pkt;
+
+	u64 adaptor_hdl;
+
+	/*
+	 * Protection domain state.
+	 */
+	struct pkt_nd_pd_create pd_cr_pkt;
+
+	u64 uar_base;
+	u64 bf_base;
+	u32 bf_buf_size;
+	u32 bf_offset;
+	u32 cqe_size;
+	u32 max_qp_wr;
+	u32 max_sge;
+	u32 max_cqe;
+	u32 num_qps;
+
+	/*
+	 * State to manage doorbell pages:
+	 */
+	struct list_head db_page_list;
+	struct mutex db_page_mutex;
+
+	atomic_t refcnt;
+
+};
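
The stopping/nr_requests_pending pair in hvnd_ep_obj implements a simple
drain protocol: every asynchronous request brackets itself with
ep_add_work_pending()/ep_del_work_pending(), and ep_stop() waits until
the count reaches zero (the helpers themselves are implemented in
provider.c further down). Typical usage, sketched with an assumed error
code:

	if (!ep_add_work_pending(ep_object))
		return -ECANCELED;	/* ep is stopping; assumed errno */

	ret = hvnd_connector_notify_disconnect(nd_dev, uctx,
					       ep_object->ep_handle,
					       ep_object);
	if (ret)
		ep_del_work_pending(ep_object); /* completion will not run */
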
+struct hvnd_dev {
+	struct ib_device ibdev;
+	struct hv_device *hvdev;
+	u32 device_cap_flags;
+	unsigned char nports;
+	bool ib_active;
+
+	/* State to manage interaction with the host.
+	 */
+
+	spinlock_t uctxt_lk;
+	struct list_head listentry;
+
+	unsigned long mmio_sz;
+	unsigned long mmio_start_addr;
+	struct resource mmio_resource;
+	void *mmio_virt;
+
+	unsigned long negotiated_version;
+	union ndv_packet_init init_pkt;
+	struct ndv_pkt_hdr_init_resources_1 resources;
+	struct ndv_pkt_hdr_bind_1 bind_pkt;
+
+	struct ndv_pkt_hdr_create_1 global_create_pkt;
+	union ndv_context_handle global_file_handle;
+
+	struct semaphore query_pkt_sem;
+	bool query_pkt_set;
+	struct pkt_nd_query_adaptor query_pkt;
+
+	/*
+	 * ID tables.
+	 */
+	spinlock_t id_lock;
+
+	struct idr cqidr;
+	struct idr qpidr;
+	struct idr mmidr;
+	struct idr irpidr;
+	struct idr uctxidr;
+	atomic_t open_cnt;
+
+	char ip_addr[4];
+	char mac_addr[6];
+	struct completion addr_set;
+	int bind_complete;
+	struct mutex bind_mutex;
+};
+
+struct hvnd_cq {
+	struct ib_cq ibcq;
+	void *cq_buf;
+	void *db_addr;
+	u32 arm_sn;
+	u32 entries;
+
+	u32 cqn;
+	u32 cqe;
+	u64 cq_handle;
+
+	struct ib_umem *umem;
+	struct ib_umem *db_umem;
+	struct mlx4_ib_user_db_page user_db_page;
+	struct hvnd_ucontext *uctx;
+	struct hvnd_ep_obj ep_object; //KYS need to clean this up; have a cq irp state
+	bool monitor;
+	bool upcall_pending;
+};
+
+struct hvnd_qp {
+	struct ib_qp ibqp;
+	void *qp_buf;
+	void *db_addr;
+	u32 buf_size;
+	u8 port;
+	struct hvnd_dev *nd_dev;
+
+	__u8 log_sq_bb_count;
+	__u8 log_sq_stride;
+	__u8 sq_no_prefetch;
+
+	int rq_wqe_cnt;
+	int rq_wqe_shift;
+	int rq_max_gs;
+
+	int sq_wqe_cnt;
+	int sq_wqe_shift;
+	int sq_max_gs;
+
+	u32 max_inline_data;
+
+	u32 initiator_q_depth;
+	u32 initiator_request_sge;
+
+	u32 receive_q_depth;
+	u32 receive_request_sge;
+
+	struct hvnd_cq *recv_cq;
+	struct hvnd_cq *send_cq;
+
+	u64 receive_cq_handle;
+	u64 initiator_cq_handle;
+	u64 pd_handle;
+
+	u64 qp_handle;
+	u32 qpn;
+	u32 max_send_wr;
+	u32 max_recv_wr;
+	u32 max_send_sge;
+	u32 max_recv_sge;
+
+	struct ib_umem *umem;
+	struct ib_umem *db_umem;
+	struct mlx4_ib_user_db_page user_db_page;
+	struct hvnd_ucontext *uctx;
+	struct iw_cm_id *cm_id;
+
+	/*
+	 * Current QP state; need to look at locking.
+	 * XXXKYS
+	 */
+	enum ib_qp_state qp_state;
+	bool cq_notify;
+	wait_queue_head_t wait;
+	atomic_t refcnt;
+	struct hvnd_ep_obj *connector;
+};
+
+struct hvnd_mr {
+	struct ib_mr ibmr;
+	struct hvnd_ib_pd *pd;
+	struct ib_umem *umem;
+	u64 start;
+	u64 length;
+	u64 virt;
+	int acc;
+	u64 mr_handle;
+	u32 mr_lkey;
+	u32 mr_rkey;
+};
+
+struct hvnd_cookie {
+	struct completion host_event;
+	void *pkt;
+};
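
The hvnd_qp sizing fields mirror mlx4's user-space layout: counts and
strides arrive as log2 values and the total buffer size follows
directly, matching set_user_sq_size() in provider.c further down. For
example, log_sq_bb_count = 6 and log_sq_stride = 6 describe 64 send
WQEs of 64 bytes each, i.e. a 4 KB send queue:

	/* equivalent computation, shown for illustration only */
	qp->sq_wqe_cnt = 1 << ucmd->log_sq_bb_count;	/* 1 << 6 = 64 */
	qp->sq_wqe_shift = ucmd->log_sq_stride;		/* 64-byte WQEs */
	qp->buf_size = (qp->rq_wqe_cnt << qp->rq_wqe_shift) +
		       (qp->sq_wqe_cnt << qp->sq_wqe_shift);
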
+
+/*
+ * Definitions to retrieve the IP address.
+ */
+
+#define HVND_CURRENT_VERSION 0
+
+struct hvnd_ipaddr_tuple {
+	char mac_address[ETH_ALEN];
+	struct sockaddr addr;
+};
+
+struct hvnd_msg {
+	int status;
+	struct hvnd_ipaddr_tuple ip_tuple;
+};
+
+static inline struct hvnd_ib_pd *to_nd_pd(struct ib_pd *pd)
+{
+	return container_of(pd, struct hvnd_ib_pd, ibpd);
+}
+
+static inline struct hvnd_dev *to_nd_dev(struct ib_device *ibdev)
+{
+	return container_of(ibdev, struct hvnd_dev, ibdev);
+}
+
+static inline struct hvnd_cq *to_nd_cq(struct ib_cq *ibcq)
+{
+	return container_of(ibcq, struct hvnd_cq, ibcq);
+}
+
+static inline struct hvnd_qp *to_nd_qp(struct ib_qp *ibqp)
+{
+	return container_of(ibqp, struct hvnd_qp, ibqp);
+}
+
+static inline struct hvnd_ucontext *to_nd_context(struct ib_ucontext *ibucontext)
+{
+	return container_of(ibucontext, struct hvnd_ucontext, ibucontext);
+}
+
+static inline struct hvnd_ucontext *get_uctx_from_pd(struct ib_pd *pd)
+{
+	return to_nd_context(pd->uobject->context);
+}
+
+static inline struct hvnd_mr *to_nd_mr(struct ib_mr *ibmr)
+{
+	return container_of(ibmr, struct hvnd_mr, ibmr);
+}
+/*
+ * ID management.
+ */
+
+static inline int insert_handle(struct hvnd_dev *dev, struct idr *idr,
+				void *handle, u32 id)
+{
+	int ret;
+	unsigned long flags;
+
+	idr_preload(GFP_KERNEL);
+	spin_lock_irqsave(&dev->id_lock, flags);
+
+	ret = idr_alloc(idr, handle, id, id + 1, GFP_ATOMIC);
+
+	spin_unlock_irqrestore(&dev->id_lock, flags);
+	idr_preload_end();
+
+	BUG_ON(ret == -ENOSPC);
+	return ret < 0 ? ret : 0;
+}
+
+static inline void remove_handle(struct hvnd_dev *dev, struct idr *idr, u32 id)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->id_lock, flags);
+	idr_remove(idr, id);
+	spin_unlock_irqrestore(&dev->id_lock, flags);
+}
+
+static inline struct hvnd_cq *get_cqp(struct hvnd_dev *dev, u32 cqid)
+{
+	struct hvnd_cq *cqp;
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->id_lock, flags);
+	cqp = idr_find(&dev->cqidr, cqid);
+	spin_unlock_irqrestore(&dev->id_lock, flags);
+
+	return cqp;
+}
+
+static inline struct hvnd_qp *get_qpp(struct hvnd_dev *dev, u32 qpid)
+{
+	struct hvnd_qp *qpp;
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->id_lock, flags);
+	qpp = idr_find(&dev->qpidr, qpid);
+	spin_unlock_irqrestore(&dev->id_lock, flags);
+
+	return qpp;
+}
+
+static inline struct hvnd_ucontext *get_uctx(struct hvnd_dev *dev, u32 pid)
+{
+	struct hvnd_ucontext *uctx;
+	unsigned long flags;
+
+	spin_lock_irqsave(&dev->id_lock, flags);
+	uctx = idr_find(&dev->uctxidr, pid);
+	spin_unlock_irqrestore(&dev->id_lock, flags);
+
+	return uctx;
+}
+
+
+static inline void *map_irp_to_ctx(struct hvnd_dev *nd_dev, u32 irp)
+{
+	void *ctx;
+	unsigned long flags;
+
+	spin_lock_irqsave(&nd_dev->id_lock, flags);
+	ctx = idr_find(&nd_dev->irpidr, irp);
+	spin_unlock_irqrestore(&nd_dev->id_lock, flags);
+
+	return ctx;
+}
+
+
+
+void hvnd_callback(void *context);
+int hvnd_negotiate_version(struct hvnd_dev *nd_dev);
+int hvnd_init_resources(struct hvnd_dev *nd_dev);
+int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, char *mac_addr);
+int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx);
+int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx);
+int hvnd_query_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx);
+int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev,
+		   struct hvnd_ib_pd *hvnd_pd);
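
The idr tables in hvnd_dev give the completion path a way back from a
32-bit ID carried in a host packet to the kernel object, all serialized
under id_lock. Usage sketch (error handling elided):

	struct hvnd_cq *cq;

	/* publish the CQ under its number, typically at create time */
	if (insert_handle(nd_dev, &nd_dev->cqidr, cq, cq->cqn))
		goto err;

	/* later, resolve the number back to the object, e.g. in the
	 * channel callback */
	cq = get_cqp(nd_dev, cqn);

	/* retire the mapping at destroy time */
	remove_handle(nd_dev, &nd_dev->cqidr, cq->cqn);
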
+
+/*
+ * CQ operations.
+ */
+int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		   struct hvnd_cq *cq);
+int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq);
+int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq,
+		   u32 notify_type, u64 irp_handle);
+
+/*
+ * QP operations.
+ */
+int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		   struct hvnd_qp *qp);
+
+int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		 struct hvnd_qp *qp);
+
+int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		  struct hvnd_qp *qp);
+
+/*
+ * MR operations.
+ */
+
+int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+	       u64 pd_handle, u64 *mr_handle);
+
+int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		 u64 handle);
+
+int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		     struct hvnd_mr *mr);
+int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		       u64 handle);
+
+/*
+ * Listener operations
+ */
+int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, u64 *handle);
+
+
+int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		       u64 listener_handle);
+
+int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		       u64 listener_handle, union nd_sockaddr_inet *addr);
+
+int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+			 u64 listener_handle, u32 backlog);
+
+int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+			   u64 listener_handle, union nd_sockaddr_inet *addr);
+
+int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+				 u64 listener_handle, u64 connector_handle,
+				 u64 irp_handle);
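
Taken together, the listener entry points above map onto the iw_cm
listen path roughly as follows (presumed call order; error handling and
the surrounding ep_object plumbing elided):

	u64 listener;
	union nd_sockaddr_inet addr;	/* local address to listen on */

	ret = hvnd_cr_listener(nd_dev, uctx, &listener);
	ret = hvnd_bind_listener(nd_dev, uctx, listener, &addr);
	ret = hvnd_listen_listener(nd_dev, uctx, listener, backlog);

	/* then, for each expected connection request: */
	ret = hvnd_get_connection_listener(nd_dev, uctx, listener,
					   connector_handle, irp_handle);
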
+
+/*
+ * Connector operations.
+ */
+int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		      u64 *connector_handle);
+
+int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+			u64 handle);
+
+int hvnd_cancelio_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+			    u64 handle);
+int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+			u64 handle, union nd_sockaddr_inet *addr);
+
+int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+			   u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit,
+			   u32 priv_data_length, const u8 *priv_data,
+			   u64 qp_handle, struct if_physical_addr *phys_addr,
+			   union nd_sockaddr_inet *dest_addr, struct hvnd_ep_obj *ep);
+
+int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+				    u64 connector_handle, enum ibv_qp_state *qp_state);
+
+int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+			  u64 connector_handle,
+			  u64 qp_handle,
+			  u32 in_rd_limit, u32 out_rd_limit,
+			  u32 priv_data_length, const u8 *priv_data,
+			  enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep);
+
+int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+			  u64 connector_handle,
+			  u32 priv_data_length, u8 *priv_data,
+			  enum ibv_qp_state *qp_state);
+
+int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev,
+				 struct hvnd_ucontext *uctx,
+				 u64 connector_handle,
+				 struct nd_read_limits *rd_limits);
+
+int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev,
+				 struct hvnd_ucontext *uctx,
+				 u64 connector_handle,
+				 u8 *priv_data);
+
+int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev,
+				 struct hvnd_ucontext *uctx,
+				 u64 connector_handle,
+				 union nd_sockaddr_inet *peer_addr);
+
+int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev,
+				  struct hvnd_ucontext *uctx,
+				  u64 connector_handle,
+				  union nd_sockaddr_inet *local_addr);
+
+int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev,
+				     struct hvnd_ucontext *uctx,
+				     u64 connector_handle, struct hvnd_ep_obj *ep);
+
+
+int hvnd_connector_disconnect(struct hvnd_dev *nd_dev,
+			      struct hvnd_ucontext *uctx,
+			      u64 connector_handle, struct hvnd_ep_obj *ep);
+
+int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		     u64 handle, u32 ioctl);
+
+int hvnd_cancel_io(struct hvnd_ep_obj *ep_object);
+
+char *hvnd_get_op_name(int ioctl);
+
+void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx);
+void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx);
+void hvnd_process_events(struct work_struct *work);
+
+void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep, int status);
+void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep, int status);
+void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status);
+void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status);
+void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status);
+
+void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp);
+int get_irp_handle(struct hvnd_dev *nd_dev, u32 *local, void *irp_ctx);
+
+void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr,
+		   u32 data_sz, u32 local, u32 remote,
+		   u32 ioctl_code,
+		   u32 ext_data_sz, u32 ext_data_offset,
+		   u64 irp_handle);
+
+int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev,
+			struct ndv_packet_hdr_control_1 *hdr,
+			u32 pkt_size, u64 cookie);
+
+int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+				union nd_sockaddr_inet *og_addr);
+
+int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote, char *mac_addr);
+
+void hvnd_addr_init(void);
+
+void hvnd_addr_deinit(void);
+
+bool ep_add_work_pending(struct hvnd_ep_obj *ep_object);
+void ep_del_work_pending(struct hvnd_ep_obj *ep_object);
+void ep_stop(struct hvnd_ep_obj *ep_object);
+
+#define current_pid() (current->pid)
+/*
+ * NT STATUS defines.
+ */
+
+#define STATUS_SUCCESS 0x0
+#define STATUS_PENDING 0x00000103
+#define STATUS_CANCELLED 0xC0000120
+#define STATUS_DISCONNECTED 0xC000020C
+#define STATUS_TIMEOUT 0xC00000B5
+
+void inc_ioctl_counter_request(unsigned ioctl);
+void inc_ioctl_counter_response(unsigned ioctl);
+
+#define NDV_PROTOCOL_VAERSION_INVALID -1
+#define NDV_PACKET_INIT_SIZE 16 /* Size of the INIT packet */
+
+#define HVND_RING_SZ (PAGE_SIZE * 64)
+
+/* logging levels */
+#define HVND_ERROR 0
+#define HVND_WARN 1
+#define HVND_INFO 2
+#define HVND_DEBUG 3
+
+extern int hvnd_log_level;
+
+#define hvnd_error(fmt, args...) hvnd_log(HVND_ERROR, fmt, ##args)
+#define hvnd_warn(fmt, args...) hvnd_log(HVND_WARN, fmt, ##args)
+#define hvnd_info(fmt, args...) hvnd_log(HVND_INFO, fmt, ##args)
+#define hvnd_debug(fmt, args...) hvnd_log(HVND_DEBUG, fmt, ##args)
+
+#define hvnd_log(level, fmt, args...) \
+do { \
+	if (unlikely(hvnd_log_level >= (level))) \
+		printk(KERN_ERR "hvnd %s[%u]: " fmt, __func__, __LINE__, ##args); \
+} while (0)
+
+#endif /* _VMBUS_RDMA_H */
diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/COPYING b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/COPYING deleted file mode 100644 index f0960c946b07..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/COPYING +++ /dev/null @@ -1,339 +0,0 @@ - -"This software program is licensed subject to the GNU General Public License -(GPL). Version 2, June 1991, available at -<http://www.fsf.org/copyleft/gpl.html>" - -GNU General Public License - -Version 2, June 1991 - -Copyright (C) 1989, 1991 Free Software Foundation, Inc. -51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA - -Everyone is permitted to copy and distribute verbatim copies of this license -document, but changing it is not allowed. - -Preamble - -The licenses for most software are designed to take away your freedom to -share and change it. By contrast, the GNU General Public License is intended -to guarantee your freedom to share and change free software--to make sure -the software is free for all its users. This General Public License applies -to most of the Free Software Foundation's software and to any other program -whose authors commit to using it. (Some other Free Software Foundation -software is covered by the GNU Library General Public License instead.) You -can apply it to your programs, too. - -When we speak of free software, we are referring to freedom, not price. Our -General Public Licenses are designed to make sure that you have the freedom -to distribute copies of free software (and charge for this service if you -wish), that you receive source code or can get it if you want it, that you -can change the software or use pieces of it in new free programs; and that -you know you can do these things. - -To protect your rights, we need to make restrictions that forbid anyone to -deny you these rights or to ask you to surrender the rights. These -restrictions translate to certain responsibilities for you if you distribute -copies of the software, or if you modify it. - -For example, if you distribute copies of such a program, whether gratis or -for a fee, you must give the recipients all the rights that you have.
You -must make sure that they, too, receive or can get the source code. And you -must show them these terms so they know their rights. - -We protect your rights with two steps: (1) copyright the software, and (2) -offer you this license which gives you legal permission to copy, distribute -and/or modify the software. - -Also, for each author's protection and ours, we want to make certain that -everyone understands that there is no warranty for this free software. If -the software is modified by someone else and passed on, we want its -recipients to know that what they have is not the original, so that any -problems introduced by others will not reflect on the original authors' -reputations. - -Finally, any free program is threatened constantly by software patents. We -wish to avoid the danger that redistributors of a free program will -individually obtain patent licenses, in effect making the program -proprietary. To prevent this, we have made it clear that any patent must be -licensed for everyone's free use or not licensed at all. - -The precise terms and conditions for copying, distribution and modification -follow. - -TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION - -0. This License applies to any program or other work which contains a notice - placed by the copyright holder saying it may be distributed under the - terms of this General Public License. The "Program", below, refers to any - such program or work, and a "work based on the Program" means either the - Program or any derivative work under copyright law: that is to say, a - work containing the Program or a portion of it, either verbatim or with - modifications and/or translated into another language. (Hereinafter, - translation is included without limitation in the term "modification".) - Each licensee is addressed as "you". - - Activities other than copying, distribution and modification are not - covered by this License; they are outside its scope. The act of running - the Program is not restricted, and the output from the Program is covered - only if its contents constitute a work based on the Program (independent - of having been made by running the Program). Whether that is true depends - on what the Program does. - -1. You may copy and distribute verbatim copies of the Program's source code - as you receive it, in any medium, provided that you conspicuously and - appropriately publish on each copy an appropriate copyright notice and - disclaimer of warranty; keep intact all the notices that refer to this - License and to the absence of any warranty; and give any other recipients - of the Program a copy of this License along with the Program. - - You may charge a fee for the physical act of transferring a copy, and you - may at your option offer warranty protection in exchange for a fee. - -2. You may modify your copy or copies of the Program or any portion of it, - thus forming a work based on the Program, and copy and distribute such - modifications or work under the terms of Section 1 above, provided that - you also meet all of these conditions: - - * a) You must cause the modified files to carry prominent notices stating - that you changed the files and the date of any change. - - * b) You must cause any work that you distribute or publish, that in - whole or in part contains or is derived from the Program or any part - thereof, to be licensed as a whole at no charge to all third parties - under the terms of this License. 
- - * c) If the modified program normally reads commands interactively when - run, you must cause it, when started running for such interactive - use in the most ordinary way, to print or display an announcement - including an appropriate copyright notice and a notice that there is - no warranty (or else, saying that you provide a warranty) and that - users may redistribute the program under these conditions, and - telling the user how to view a copy of this License. (Exception: if - the Program itself is interactive but does not normally print such - an announcement, your work based on the Program is not required to - print an announcement.) - - These requirements apply to the modified work as a whole. If identifiable - sections of that work are not derived from the Program, and can be - reasonably considered independent and separate works in themselves, then - this License, and its terms, do not apply to those sections when you - distribute them as separate works. But when you distribute the same - sections as part of a whole which is a work based on the Program, the - distribution of the whole must be on the terms of this License, whose - permissions for other licensees extend to the entire whole, and thus to - each and every part regardless of who wrote it. - - Thus, it is not the intent of this section to claim rights or contest - your rights to work written entirely by you; rather, the intent is to - exercise the right to control the distribution of derivative or - collective works based on the Program. - - In addition, mere aggregation of another work not based on the Program - with the Program (or with a work based on the Program) on a volume of a - storage or distribution medium does not bring the other work under the - scope of this License. - -3. You may copy and distribute the Program (or a work based on it, under - Section 2) in object code or executable form under the terms of Sections - 1 and 2 above provided that you also do one of the following: - - * a) Accompany it with the complete corresponding machine-readable source - code, which must be distributed under the terms of Sections 1 and 2 - above on a medium customarily used for software interchange; or, - - * b) Accompany it with a written offer, valid for at least three years, - to give any third party, for a charge no more than your cost of - physically performing source distribution, a complete machine- - readable copy of the corresponding source code, to be distributed - under the terms of Sections 1 and 2 above on a medium customarily - used for software interchange; or, - - * c) Accompany it with the information you received as to the offer to - distribute corresponding source code. (This alternative is allowed - only for noncommercial distribution and only if you received the - program in object code or executable form with such an offer, in - accord with Subsection b above.) - - The source code for a work means the preferred form of the work for - making modifications to it. For an executable work, complete source code - means all the source code for all modules it contains, plus any - associated interface definition files, plus the scripts used to control - compilation and installation of the executable. 
However, as a special - exception, the source code distributed need not include anything that is - normally distributed (in either source or binary form) with the major - components (compiler, kernel, and so on) of the operating system on which - the executable runs, unless that component itself accompanies the - executable. - - If distribution of executable or object code is made by offering access - to copy from a designated place, then offering equivalent access to copy - the source code from the same place counts as distribution of the source - code, even though third parties are not compelled to copy the source - along with the object code. - -4. You may not copy, modify, sublicense, or distribute the Program except as - expressly provided under this License. Any attempt otherwise to copy, - modify, sublicense or distribute the Program is void, and will - automatically terminate your rights under this License. However, parties - who have received copies, or rights, from you under this License will not - have their licenses terminated so long as such parties remain in full - compliance. - -5. You are not required to accept this License, since you have not signed - it. However, nothing else grants you permission to modify or distribute - the Program or its derivative works. These actions are prohibited by law - if you do not accept this License. Therefore, by modifying or - distributing the Program (or any work based on the Program), you - indicate your acceptance of this License to do so, and all its terms and - conditions for copying, distributing or modifying the Program or works - based on it. - -6. Each time you redistribute the Program (or any work based on the - Program), the recipient automatically receives a license from the - original licensor to copy, distribute or modify the Program subject to - these terms and conditions. You may not impose any further restrictions - on the recipients' exercise of the rights granted herein. You are not - responsible for enforcing compliance by third parties to this License. - -7. If, as a consequence of a court judgment or allegation of patent - infringement or for any other reason (not limited to patent issues), - conditions are imposed on you (whether by court order, agreement or - otherwise) that contradict the conditions of this License, they do not - excuse you from the conditions of this License. If you cannot distribute - so as to satisfy simultaneously your obligations under this License and - any other pertinent obligations, then as a consequence you may not - distribute the Program at all. For example, if a patent license would - not permit royalty-free redistribution of the Program by all those who - receive copies directly or indirectly through you, then the only way you - could satisfy both it and this License would be to refrain entirely from - distribution of the Program. - - If any portion of this section is held invalid or unenforceable under any - particular circumstance, the balance of the section is intended to apply - and the section as a whole is intended to apply in other circumstances. - - It is not the purpose of this section to induce you to infringe any - patents or other property right claims or to contest validity of any - such claims; this section has the sole purpose of protecting the - integrity of the free software distribution system, which is implemented - by public license practices. 
Many people have made generous contributions - to the wide range of software distributed through that system in - reliance on consistent application of that system; it is up to the - author/donor to decide if he or she is willing to distribute software - through any other system and a licensee cannot impose that choice. - - This section is intended to make thoroughly clear what is believed to be - a consequence of the rest of this License. - -8. If the distribution and/or use of the Program is restricted in certain - countries either by patents or by copyrighted interfaces, the original - copyright holder who places the Program under this License may add an - explicit geographical distribution limitation excluding those countries, - so that distribution is permitted only in or among countries not thus - excluded. In such case, this License incorporates the limitation as if - written in the body of this License. - -9. The Free Software Foundation may publish revised and/or new versions of - the General Public License from time to time. Such new versions will be - similar in spirit to the present version, but may differ in detail to - address new problems or concerns. - - Each version is given a distinguishing version number. If the Program - specifies a version number of this License which applies to it and "any - later version", you have the option of following the terms and - conditions either of that version or of any later version published by - the Free Software Foundation. If the Program does not specify a version - number of this License, you may choose any version ever published by the - Free Software Foundation. - -10. If you wish to incorporate parts of the Program into other free programs - whose distribution conditions are different, write to the author to ask - for permission. For software which is copyrighted by the Free Software - Foundation, write to the Free Software Foundation; we sometimes make - exceptions for this. Our decision will be guided by the two goals of - preserving the free status of all derivatives of our free software and - of promoting the sharing and reuse of software generally. - - NO WARRANTY - -11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY - FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN - OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES - PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER - EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED - WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE - ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH - YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL - NECESSARY SERVICING, REPAIR OR CORRECTION. - -12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING - WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR - REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR - DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL - DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM - (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED - INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF - THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR - OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 
- -END OF TERMS AND CONDITIONS - -How to Apply These Terms to Your New Programs - -If you develop a new program, and you want it to be of the greatest -possible use to the public, the best way to achieve this is to make it free -software which everyone can redistribute and change under these terms. - -To do so, attach the following notices to the program. It is safest to -attach them to the start of each source file to most effectively convey the -exclusion of warranty; and each file should have at least the "copyright" -line and a pointer to where the full notice is found. - -one line to give the program's name and an idea of what it does. -Copyright (C) yyyy name of author - -This program is free software; you can redistribute it and/or modify it -under the terms of the GNU General Public License as published by the Free -Software Foundation; either version 2 of the License, or (at your option) -any later version. - -This program is distributed in the hope that it will be useful, but WITHOUT -ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or -FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for -more details. - -You should have received a copy of the GNU General Public License along with -this program; if not, write to the Free Software Foundation, Inc., 59 -Temple Place - Suite 330, Boston, MA 02111-1307, USA. - -Also add information on how to contact you by electronic and paper mail. - -If the program is interactive, make it output a short notice like this when -it starts in an interactive mode: - -Gnomovision version 69, Copyright (C) year name of author Gnomovision comes -with ABSOLUTELY NO WARRANTY; for details type 'show w'. This is free -software, and you are welcome to redistribute it under certain conditions; -type 'show c' for details. - -The hypothetical commands 'show w' and 'show c' should show the appropriate -parts of the General Public License. Of course, the commands you use may be -called something other than 'show w' and 'show c'; they could even be -mouse-clicks or menu items--whatever suits your program. - -You should also get your employer (if you work as a programmer) or your -school, if any, to sign a "copyright disclaimer" for the program, if -necessary. Here is a sample; alter the names: - -Yoyodyne, Inc., hereby disclaims all copyright interest in the program -'Gnomovision' (which makes passes at compilers) written by James Hacker. - -signature of Ty Coon, 1 April 1989 -Ty Coon, President of Vice - -This General Public License does not permit incorporating your program into -proprietary programs. If your program is a subroutine library, you may -consider it more useful to permit linking proprietary applications with the -library. If this is what you want to do, use the GNU Library General Public -License instead of this License. 
diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/Makefile b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/Makefile deleted file mode 100644 index 0373d95341c1..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/Makefile +++ /dev/null @@ -1,3 +0,0 @@ -obj-$(CONFIG_HYPERV_INFINIBAND_ND) += hv_network_direct_142.o - -hv_network_direct_142-y := provider.o vmbus_rdma.o hvnd_addr.o diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/hvnd_addr.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/hvnd_addr.c deleted file mode 100644 index c8c2295b63af..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/hvnd_addr.c +++ /dev/null @@ -1,136 +0,0 @@ -/* - * Copyright (c) 2014, Microsoft Corporation. - * - * Author: - * K. Y. Srinivasan <kys@microsoft.com> - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License version 2 as published - * by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or - * NON INFRINGEMENT. See the GNU General Public License for more - * details. - * - * Bug fixes/enhancements: Long Li <longli@microsoft.com> - */ - -#include <linux/completion.h> -#include <linux/module.h> -#include <linux/errno.h> -#include <linux/hyperv.h> -#include <linux/efi.h> -#include <linux/slab.h> -#include <linux/cred.h> -#include <linux/sched.h> -#include <linux/types.h> -#include <linux/completion.h> -#include <linux/scatterlist.h> -#include <rdma/ib_umem.h> -#include <rdma/ib_addr.h> - -#include "vmbus_rdma.h" - - -#include <linux/semaphore.h> -#include <linux/fs.h> -#include <linux/nls.h> -#include <linux/workqueue.h> -#include <linux/cdev.h> -#include <linux/hyperv.h> -#include <linux/sched.h> -#include <linux/uaccess.h> -#include <linux/miscdevice.h> -#include <linux/hyperv.h> - - -int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - union nd_sockaddr_inet *og_addr) -{ - int ret; - /* - * Query the host and select the first address. - */ - struct pkt_query_addr_list pkt; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - - hvnd_init_hdr(&pkt.hdr, - (sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1)), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST, 0, 0, 0); - - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = uctx->adaptor_hdl; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, - sizeof(pkt), (u64)&pkt); - - if (ret) - return ret; - - /* - * Copy the address out. 
- */ - - memcpy(og_addr, &pkt.ioctl.out[0], sizeof(*og_addr)); - return 0; - -} - -static struct rdma_addr_client self; - -struct resolve_cb_context { - struct rdma_dev_addr *addr; - struct completion comp; -}; - -void hvnd_addr_init(void) -{ - rdma_addr_register_client(&self); -} - -void hvnd_addr_deinit(void) -{ - rdma_addr_unregister_client(&self); -} - -static void resolve_cb(int status, struct sockaddr *src_addr, - struct rdma_dev_addr *addr, void *context) -{ - memcpy(((struct resolve_cb_context *)context)->addr, addr, sizeof(struct - rdma_dev_addr)); - complete(&((struct resolve_cb_context *)context)->comp); -} - -int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote, - char *mac_addr) -{ - struct rdma_dev_addr dev_addr; - struct resolve_cb_context ctx; - int ret; - - memset(&dev_addr, 0, sizeof(dev_addr)); - dev_addr.net = &init_net; - ctx.addr = &dev_addr; - init_completion(&ctx.comp); - - ret = rdma_resolve_ip(&self, local, remote, &dev_addr, 1000, - resolve_cb, &ctx); - - if (ret) { - hvnd_error("rdma_resolve_ip failed ret=%d\n", ret); - return ret; - } - - wait_for_completion(&ctx.comp); - memcpy(mac_addr, dev_addr.dst_dev_addr, ETH_ALEN); - return ret; -} diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/mx_abi.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/mx_abi.h deleted file mode 100644 index 8f55cb408de0..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/mx_abi.h +++ /dev/null @@ -1,232 +0,0 @@ -/* - * Copyright (c) 2005 Topspin Communications. All rights reserved. - * Copyright (c) 2005 Cisco Systems. All rights reserved. - * Copyright (c) 2005 PathScale, Inc. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - * KYS: made some modifications. - */ - -#ifndef MX_ABI_H -#define MX_ABI_H - - - -/* - * Make sure that all structs defined in this file remain laid out so - * that they pack the same way on 32-bit and 64-bit architectures (to - * avoid incompatibility between 32-bit userspace and 64-bit kernels). - * Specifically: - * - Do not use pointer types -- pass pointers in UINT64 instead. - * - Make sure that any structure larger than 4 bytes is padded to a - * multiple of 8 bytes. 
Otherwise the structure size will be - * different between 32-bit and 64-bit architectures. - */ - -enum ibv_get_context_mappings { - IBV_GET_CONTEXT_UAR, - IBV_GET_CONTEXT_BF, - IBV_GET_CONTEXT_MAPPING_MAX -}; - -struct ibv_get_context_req { - - union nd_mapping mappings[IBV_GET_CONTEXT_MAPPING_MAX]; -}; - -struct ibv_get_context_resp { - - // mmap UAR and BF - struct nd_mapping_result mapping_results[IBV_GET_CONTEXT_MAPPING_MAX]; - - // mmap Blue Flame - int bf_buf_size; - int bf_offset; - - // mlx4_query_device result - int max_qp_wr; - int max_sge; - int max_cqe; - - // general parameters - u32 cqe_size; - u32 vend_id; - u16 dev_id; - u16 bf_reg_size; - u16 bf_regs_per_page; - u16 reserved1; - - // ibv_cmd_get_context result - u32 qp_tab_size; - - u32 reserved2; -}; - -struct ibv_alloc_pd_resp { - u64 pd_handle; - u32 pdn; - u32 reserved; -}; - -struct ibv_reg_mr { - u64 start; - u64 length; - u64 hca_va; - u32 access_flags; - u32 pdn; - u64 pd_handle; -}; - -struct ibv_reg_mr_resp { - u64 mr_handle; - u32 lkey; - u32 rkey; -}; - - -enum mlx4_ib_create_cq_mapping { - MLX4_IB_CREATE_CQ_BUF, - MLX4_IB_CREATE_CQ_DB, - MLX4_IB_CREATE_CQ_ARM_SN, // Windows specific - MLX4_IB_CREATE_CQ_MAPPING_MAX -}; - -#define MLX4_CQ_FLAGS_ARM_IN_KERNEL 1 - -struct ibv_create_cq { - union nd_mapping mappings[MLX4_IB_CREATE_CQ_MAPPING_MAX]; - u32 flags; -}; - -struct ibv_create_cq_resp { - struct nd_mapping_result mapping_results[MLX4_IB_CREATE_CQ_MAPPING_MAX]; - u32 cqn; - u32 cqe; -}; - -enum mlx4_ib_create_srq_mappings { - MLX4_IB_CREATE_SRQ_BUF, - MLX4_IB_CREATE_SRQ_DB, - MLX4_IB_CREATE_SRQ_MAPPINGS_MAX -}; - -struct ibv_create_srq { - union nd_mapping mappings[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX]; -}; - -struct ibv_create_srq_resp { - struct nd_mapping_result mapping_results[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX]; -}; - -enum mlx4_ib_create_qp_mappings { - MLX4_IB_CREATE_QP_BUF, - MLX4_IB_CREATE_QP_DB, - MLX4_IB_CREATE_QP_MAPPINGS_MAX -}; - -struct ibv_create_qp { - union nd_mapping mappings[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; - u8 log_sq_bb_count; - u8 log_sq_stride; - u8 sq_no_prefetch; - u8 reserved; -}; - -struct ibv_create_qp_resp { - struct nd_mapping_result mapping_results[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; - // struct ib_uverbs_create_qp_resp - u64 qp_handle; - u32 qpn; - u32 max_send_wr; - u32 max_recv_wr; - u32 max_send_sge; - u32 max_recv_sge; - u32 max_inline_data; -}; - -enum ibv_qp_attr_mask { - IBV_QP_STATE = 1 << 0, - IBV_QP_CUR_STATE = 1 << 1, - IBV_QP_EN_SQD_ASYNC_NOTIFY = 1 << 2, - IBV_QP_ACCESS_FLAGS = 1 << 3, - IBV_QP_PKEY_INDEX = 1 << 4, - IBV_QP_PORT = 1 << 5, - IBV_QP_QKEY = 1 << 6, - IBV_QP_AV = 1 << 7, - IBV_QP_PATH_MTU = 1 << 8, - IBV_QP_TIMEOUT = 1 << 9, - IBV_QP_RETRY_CNT = 1 << 10, - IBV_QP_RNR_RETRY = 1 << 11, - IBV_QP_RQ_PSN = 1 << 12, - IBV_QP_MAX_QP_RD_ATOMIC = 1 << 13, - IBV_QP_ALT_PATH = 1 << 14, - IBV_QP_MIN_RNR_TIMER = 1 << 15, - IBV_QP_SQ_PSN = 1 << 16, - IBV_QP_MAX_DEST_RD_ATOMIC = 1 << 17, - IBV_QP_PATH_MIG_STATE = 1 << 18, - IBV_QP_CAP = 1 << 19, - IBV_QP_DEST_QPN = 1 << 20 -}; - -enum ibv_qp_state { - IBV_QPS_RESET, - IBV_QPS_INIT, - IBV_QPS_RTR, - IBV_QPS_RTS, - IBV_QPS_SQD, - IBV_QPS_SQE, - IBV_QPS_ERR -}; - - -struct ibv_modify_qp_resp { - enum ibv_qp_attr_mask attr_mask; - u8 qp_state; - u8 reserved[3]; -}; - -struct ibv_create_ah_resp { - u64 start; -}; - -/* - * Some mlx4 specific kernel definitions. Perhaps could be in - * separate file. 
- */ - -struct mlx4_ib_user_db_page { - struct list_head list; - struct ib_umem *umem; - unsigned long user_virt; - int refcnt; -}; - - -#endif /* MX_ABI_H */ diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/provider.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/provider.c deleted file mode 100644 index 7115cf769fe0..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/provider.c +++ /dev/null @@ -1,2961 +0,0 @@ -/* - * Copyright (c) 2014, Microsoft Corporation. - * - * Author: - * K. Y. Srinivasan <kys@microsoft.com> - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License version 2 as published - * by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or - * NON INFRINGEMENT. See the GNU General Public License for more - * details. - * - * Bug fixes/enhancements: Long Li <longli@microsoft.com> - */ - -#include <linux/module.h> -#include <linux/moduleparam.h> -#include <linux/device.h> -#include <linux/netdevice.h> -#include <linux/etherdevice.h> -#include <linux/delay.h> -#include <linux/errno.h> -#include <linux/list.h> -#include <linux/spinlock.h> -#include <linux/ethtool.h> -#include <linux/rtnetlink.h> -#include <linux/inetdevice.h> -#include <linux/io.h> -#include <linux/hyperv.h> -#include <linux/completion.h> -#include <rdma/iw_cm.h> -#include <rdma/ib_verbs.h> -#include <rdma/ib_smi.h> -#include <rdma/ib_umem.h> -#include <rdma/ib_user_verbs.h> -#include <rdma/rdma_cm.h> -#include <linux/proc_fs.h> -#include <linux/seq_file.h> -#include "vmbus_rdma.h" - -/* - * We are emulating mlx4. XXXKYS: May have to FIX. 
- */ -#include "./user.h" - -static struct hvnd_dev *g_nd_dev; // the one and only one - -int hvnd_log_level = HVND_ERROR; -module_param(hvnd_log_level, int, 0644); -MODULE_PARM_DESC(hvnd_log_level, - "Logging level, 0 - Error (default), 1 - Warning, 2 - Info, 3 - Debug."); - -static int disable_cq_notify = 1; -//static int disable_cq_notify = 0; -module_param(disable_cq_notify, int, 0644); -MODULE_PARM_DESC(disable_cq_notify, - "Disable CQ notification, 0 - Enable, 1 - Disable (default)."); - -enum { - MLX4_USER_DEV_CAP_64B_CQE = 1L << 0 -}; - -#define HVND_NODE_DESC "vmbus-RDMA" - -#undef MLX4_IB_UVERBS_ABI_VERSION -#define MLX4_IB_UVERBS_ABI_VERSION 4 - -struct mlx4_wqe_data_seg { - __be32 byte_count; - __be32 lkey; - __be64 addr; -}; - -/* return value: - * true: ep is running - * false: ep is stopped - */ -bool ep_add_work_pending(struct hvnd_ep_obj *ep_object) -{ - bool ret = true; - - atomic_inc(&ep_object->nr_requests_pending); - if (ep_object->stopping) { - if (atomic_dec_and_test(&ep_object->nr_requests_pending)) - wake_up(&ep_object->wait_pending); - ret = false; - } - return ret; -} - -void ep_del_work_pending(struct hvnd_ep_obj *ep_object) -{ - if (atomic_dec_and_test(&ep_object->nr_requests_pending)) - wake_up(&ep_object->wait_pending); - - if (atomic_read(&ep_object->nr_requests_pending) < 0) { - hvnd_error("ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", - atomic_read(&ep_object->nr_requests_pending), - ep_object->type, ep_object->cm_state); - dump_stack(); - } -} - -void ep_stop(struct hvnd_ep_obj *ep_object) -{ - if (!ep_object->stopping) { - ep_object->stopping = true; - hvnd_cancel_io(ep_object); - } - - if (atomic_read(&ep_object->nr_requests_pending) < 0) { - hvnd_error("IO canceled, ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", - atomic_read(&ep_object->nr_requests_pending), - ep_object->type, ep_object->cm_state); - dump_stack(); - } - - wait_event(ep_object->wait_pending, - !atomic_read(&ep_object->nr_requests_pending)); -} - -static int vmbus_dma_map_sg(struct device *dev, struct scatterlist *sgl, - int nents, enum dma_data_direction direction, - unsigned long attrs) -{ - struct scatterlist *sg; - u64 addr; - int i; - int ret = nents; - - BUG_ON(!valid_dma_direction(direction)); - - for_each_sg(sgl, sg, nents, i) { - addr = (u64) page_address(sg_page(sg)); - /* TODO: handle highmem pages */ - if (!addr) { - ret = 0; - break; - } - sg->dma_address = addr + sg->offset; - sg->dma_length = sg->length; - } - return ret; -} - -static void vmbus_dma_unmap_sg(struct device *dev, - struct scatterlist *sg, int nents, - enum dma_data_direction direction, unsigned long attrs) -{ - BUG_ON(!valid_dma_direction(direction)); -} - - -const struct dma_map_ops vmbus_dma_ops = { - .map_sg = vmbus_dma_map_sg, - .unmap_sg = vmbus_dma_unmap_sg, -}; - -static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, - struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx); - -static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, - struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx); - -static void hvnd_deinit_ep(struct hvnd_ep_obj *ep) -{ - put_irp_handle(ep->nd_dev, ep->local_irp); -} - -static void hvnd_destroy_ep(struct hvnd_ep_obj *ep) -{ - hvnd_debug("canceling work for ep %p\n", ep); - cancel_work_sync(&ep->wrk.work); - hvnd_deinit_ep(ep); - kfree(ep); -} - - -#define UC(b) (((int)b)&0xff) -char *debug_inet_ntoa(struct in_addr in, char *b) -{ - register char *p; - - p = (char *)∈ - (void)snprintf(b, 20, - "%d.%d.%d.%d", 
UC(p[0]), UC(p[1]), UC(p[2]), UC(p[3])); - return b; -} - -void hvnd_process_events(struct work_struct *work); - -static int hvnd_init_ep(struct hvnd_ep_obj *ep_object, - struct iw_cm_id *cm_id, int ep_type, - struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx) -{ - int ret; - - ep_object->type = ep_type; - ep_object->cm_id = cm_id; - ep_object->nd_dev = nd_dev; - ep_object->uctx = uctx; - - ep_object->parent = NULL; - - ep_object->wrk.callback_arg = ep_object; - INIT_WORK(&ep_object->wrk.work, hvnd_process_events); - INIT_LIST_HEAD(&ep_object->incoming_pkt_list); - spin_lock_init(&ep_object->incoming_pkt_list_lock); - - ep_object->stopping = false; - atomic_set(&ep_object->nr_requests_pending, 0); - init_waitqueue_head(&ep_object->wait_pending); - - ret = get_irp_handle(nd_dev, &ep_object->local_irp, (void *)ep_object); - - if (ret) { - hvnd_error("get_irp_handle() failed: err: %d\n", ret); - return ret; - } - return 0; - -} - -static int set_rq_size(struct hvnd_dev *dev, struct ib_qp_cap *cap, - struct hvnd_qp *qp) -{ - - /* HW requires >= 1 RQ entry with >= 1 gather entry */ - if (!cap->max_recv_wr || !cap->max_recv_sge) - return -EINVAL; - - qp->rq_wqe_cnt = roundup_pow_of_two(max(1U, cap->max_recv_wr)); - qp->rq_max_gs = roundup_pow_of_two(max(1U, cap->max_recv_sge)); - qp->rq_wqe_shift = ilog2( - qp->rq_max_gs * sizeof(struct mlx4_wqe_data_seg)); - - - return 0; -} - -static int set_user_sq_size(struct hvnd_dev *dev, - struct hvnd_qp *qp, - struct mlx4_ib_create_qp *ucmd) -{ - qp->sq_wqe_cnt = 1 << ucmd->log_sq_bb_count; - qp->sq_wqe_shift = ucmd->log_sq_stride; - - qp->buf_size = (qp->rq_wqe_cnt << qp->rq_wqe_shift) + - (qp->sq_wqe_cnt << qp->sq_wqe_shift); - - return 0; -} - -static int hvnd_db_map_user(struct hvnd_ucontext *uctx, unsigned long virt, - struct ib_umem **db_umem) -{ - struct mlx4_ib_user_db_page *page; - int err = 0; - - mutex_lock(&uctx->db_page_mutex); - - list_for_each_entry(page, &uctx->db_page_list, list) - if (page->user_virt == (virt & PAGE_MASK)) - goto found; - - page = kmalloc(sizeof(*page), GFP_KERNEL); - if (!page) { - err = -ENOMEM; - goto out; - } - - page->user_virt = (virt & PAGE_MASK); - page->refcnt = 0; - page->umem = ib_umem_get(&uctx->ibucontext, virt & PAGE_MASK, - PAGE_SIZE, 0, 0); - if (IS_ERR(page->umem)) { - hvnd_error("ib_umem_get failure\n"); - err = PTR_ERR(page->umem); - kfree(page); - goto out; - } - - list_add(&page->list, &uctx->db_page_list); - -found: - ++page->refcnt; -out: - mutex_unlock(&uctx->db_page_mutex); - if (!err) - *db_umem = page->umem; - - return err; -} - -static void hvnd_db_unmap_user(struct hvnd_ucontext *uctx, u64 db_addr) -{ - struct mlx4_ib_user_db_page *page; - - mutex_lock(&uctx->db_page_mutex); - list_for_each_entry(page, &uctx->db_page_list, list) - if (page->user_virt == (db_addr & PAGE_MASK)) - goto found; - -found: - if (!--page->refcnt) { - list_del(&page->list); - ib_umem_release(page->umem); - kfree(page); - } - - mutex_unlock(&uctx->db_page_mutex); -} - - -static void debug_check(const char *func, int line) -{ - hvnd_debug("func is: %s; line is %d\n", func, line); - - if (in_interrupt()) { - hvnd_error("In interrupt func is: %s; line is %d\n", - func, line); - return; - } -} - -static struct ib_ah *hvnd_ah_create(struct ib_pd *pd, - struct ib_ah_attr *ah_attr, - struct ib_udata *udata) -{ - debug_check(__func__, __LINE__); - return ERR_PTR(-ENOSYS); -} - -static int hvnd_ah_destroy(struct ib_ah *ah) -{ - debug_check(__func__, __LINE__); - return -ENOSYS; -} - -static int hvnd_multicast_attach(struct 
ib_qp *ibqp, union ib_gid *gid, u16 lid) -{ - debug_check(__func__, __LINE__); - return -ENOSYS; -} - -static int hvnd_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) -{ - debug_check(__func__, __LINE__); - return -ENOSYS; -} - -static int hvnd_process_mad(struct ib_device *ibdev, - int mad_flags, - u8 port_num, - const struct ib_wc *in_wc, - const struct ib_grh *in_grh, - const struct ib_mad_hdr *in_mad, - size_t in_mad_size, - struct ib_mad_hdr *out_mad, - size_t *out_mad_size, - u16 *out_mad_pkey_index) -{ - debug_check(__func__, __LINE__); - return -ENOSYS; -} - -void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx) -{ - atomic_inc(&uctx->refcnt); -} - -void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx) -{ - if (atomic_dec_and_test(&uctx->refcnt)) { - hvnd_debug("uctx ref cnt dropped it is %d\n", - atomic_read(&uctx->refcnt)); - hvnd_debug("About to close adaptor\n"); - hvnd_close_adaptor(nd_dev, uctx); - } else - hvnd_debug("uctx ref cnt dropped it is %d\n", - atomic_read(&uctx->refcnt)); -} - - -static int hvnd_dealloc_ucontext(struct ib_ucontext *context) -{ - struct hvnd_dev *nd_dev; - struct hvnd_ucontext *uctx; - - uctx = to_nd_context(context); - nd_dev = to_nd_dev(context->device); - - hvnd_debug("calling %s\n", __func__); - - hvnd_drop_uctx_ref(nd_dev, uctx); - - return 0; -} - -static struct ib_ucontext *hvnd_alloc_ucontext(struct ib_device *ibdev, - struct ib_udata *udata) -{ - struct hvnd_dev *nd_dev = to_nd_dev(ibdev); - struct hvnd_ucontext *uctx; - struct mlx4_ib_alloc_ucontext_resp resp; - struct ibv_get_context_resp *ctx_output; - int ret; - - if (!nd_dev->ib_active) { - hvnd_error("ib device is not active, try again\n"); - return ERR_PTR(-EAGAIN); - } - - uctx = get_uctx(nd_dev, current_pid()); - if (uctx) { - // it is already opened, just increase its reference count - hvnd_acquire_uctx_ref(uctx); - } else { - - /* - * The Windows host expects the following to be done: - * 1. Successfully send struct ndv_pkt_hdr_create_1 - * 2. INIT PROVIDER - * 3. Open Adapter - * Before we can complete this call. - */ - - uctx = kzalloc(sizeof(struct hvnd_ucontext), GFP_KERNEL); - if (!uctx) - return ERR_PTR(-ENOMEM); - - atomic_set(&uctx->refcnt, 1); - INIT_LIST_HEAD(&uctx->db_page_list); - mutex_init(&uctx->db_page_mutex); - - /* - * Stash away the context with the calling PID. - */ - ret = insert_handle(nd_dev, &nd_dev->uctxidr, uctx, - current_pid()); - if (ret) { - hvnd_error("Uctx ID insertion failed; ret is %d\n", - ret); - goto err1; - } - - hvnd_debug("Opening adaptor pid is %d\n", current_pid()); - - ret = hvnd_open_adaptor(nd_dev, uctx); - if (ret) { - hvnd_error("hvnd_open_adaptor failed ret=%d\n", ret); - goto err1; - } - - } - - /* - * Copy the response out. 
- */ - ctx_output = &uctx->o_adap_pkt.mappings.ctx_output; - resp.dev_caps = MLX4_USER_DEV_CAP_64B_CQE; - resp.qp_tab_size = ctx_output->qp_tab_size; - resp.bf_reg_size = ctx_output->bf_reg_size; - resp.bf_regs_per_page = ctx_output->bf_regs_per_page; - resp.cqe_size = ctx_output->cqe_size; - - ret = ib_copy_to_udata(udata, &resp, sizeof(resp)); - if (ret) { - hvnd_error("ib_copy_to_udata failed ret=%d\n", ret); - goto err1; - } - - return &uctx->ibucontext; - -err1: - kfree(uctx); - return ERR_PTR(ret); -} - -static int hvnd_mmap(struct ib_ucontext *context, struct vm_area_struct *vma) -{ - struct hvnd_ucontext *uctx = to_nd_context(context); - - if (vma->vm_end - vma->vm_start != PAGE_SIZE) { - hvnd_error("vma not a page size, actual size=%lu\n", - vma->vm_end - vma->vm_start); - return -EINVAL; - } - - if (vma->vm_pgoff == 0) { - vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); - - if (io_remap_pfn_range(vma, vma->vm_start, - (uctx->uar_base >> PAGE_SHIFT), - PAGE_SIZE, vma->vm_page_prot)) { - hvnd_error("io_remap_pfn_range failure\n"); - return -EAGAIN; - } - } else if (vma->vm_pgoff == 1 && uctx->bf_buf_size != 0) { - vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); - - if (io_remap_pfn_range(vma, vma->vm_start, - (uctx->uar_base >> PAGE_SHIFT) + 1, - PAGE_SIZE, vma->vm_page_prot)) { - hvnd_error("io_remap_pfn_range failure\n"); - return -EAGAIN; - } - } else { - hvnd_error("check code\n"); - return -EINVAL; - } - - return 0; -} - -static int hvnd_deallocate_pd(struct ib_pd *pd) -{ - - struct hvnd_ucontext *uctx; - struct hvnd_dev *nd_dev; - struct hvnd_ib_pd *hvnd_pd; - struct ib_ucontext *ibuctx = pd->uobject->context; - - hvnd_pd = to_nd_pd(pd); - nd_dev = to_nd_dev(pd->device); - uctx = to_nd_context(ibuctx); - - hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, - IOCTL_ND_PD_FREE); - - hvnd_drop_uctx_ref(nd_dev, uctx); - return 0; -} - -static struct ib_pd *hvnd_allocate_pd(struct ib_device *ibdev, - struct ib_ucontext *context, - struct ib_udata *udata) -{ - struct hvnd_ucontext *uctx; - struct hvnd_dev *nd_dev; - int ret; - struct hvnd_ib_pd *hvnd_pd; - - if (!context) { - hvnd_error("kernel mode context not supported\n"); - return ERR_PTR(-EINVAL); - } - - hvnd_pd = kzalloc(sizeof(struct hvnd_ib_pd), GFP_KERNEL); - - if (!hvnd_pd) - return ERR_PTR(-ENOMEM); - - uctx = to_nd_context(context); - nd_dev = to_nd_dev(ibdev); - - ret = hvnd_create_pd(uctx, nd_dev, hvnd_pd); - if (ret) { - hvnd_error("hvnd_create_pd failure ret=%d\n", ret); - goto error_cr_pd; - } - - if (context) { - if (ib_copy_to_udata(udata, &hvnd_pd->pdn, sizeof(__u32))) { - hvnd_error("ib_copy_to_udata failure\n"); - ret = -EFAULT; - goto error_fault; - } - } - - hvnd_acquire_uctx_ref(uctx); - return &hvnd_pd->ibpd; - -error_fault: - hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, - IOCTL_ND_PD_FREE); - -error_cr_pd: - kfree(hvnd_pd); - return ERR_PTR(ret); -} - -static int hvnd_query_pkey(struct ib_device *ibdev, u8 port, u16 index, - u16 *pkey) -{ - debug_check(__func__, __LINE__); - *pkey = 0; - return 0; -} - -static int hvnd_query_gid(struct ib_device *ibdev, u8 port, int index, - union ib_gid *gid) -{ - int ret; - struct hvnd_dev *nd_dev = to_nd_dev(ibdev); - - debug_check(__func__, __LINE__); - - ret = wait_for_completion_timeout(&nd_dev->addr_set, 60*HZ); - if (!ret) - return -ETIMEDOUT; - - memset(&(gid->raw[0]), 0, sizeof(gid->raw)); - memcpy(&(gid->raw[0]), nd_dev->mac_addr, 6); - return 0; -} - -static int hvnd_query_device(struct ib_device *ibdev, - struct ib_device_attr *props, - 
struct ib_udata *udata) -{ - struct hvnd_dev *nd_dev = to_nd_dev(ibdev); - struct adapter_info_v2 *adap_info; - - if (!nd_dev->query_pkt_set) { - hvnd_error("query packet not received yet\n"); - return 0; - } - - adap_info = &nd_dev->query_pkt.ioctl.ad_info; - - memset(props, 0, sizeof(*props)); - - /* - * Copy the relevant properties out. - */ - props->fw_ver = 0; - props->device_cap_flags = 0; - //props->device_cap_flags |= IB_DEVICE_BAD_PKEY_CNTR; - //props->device_cap_flags |= IB_DEVICE_BAD_QKEY_CNTR; - //props->device_cap_flags |= IB_DEVICE_XRC; - - props->vendor_id = 0x15b3; - props->vendor_part_id = adap_info->device_id; - - props->max_mr_size = ~0ull; - props->page_size_cap = PAGE_SIZE; - props->max_qp = 16384; - props->max_qp_wr = min(adap_info->max_recv_q_depth, - adap_info->max_initiator_q_depth); - - props->max_sge = min(adap_info->max_initiator_sge, - adap_info->max_recv_sge); - props->max_cq = 0x1FFFF; - props->max_cqe = adap_info->max_completion_q_depth; - props->max_mr = 16384; - props->max_pd = 16384; - - props->max_qp_rd_atom = adap_info->max_inbound_read_limit; - props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; - props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; - props->max_srq = 16384; - props->max_srq_wr = adap_info->max_recv_q_depth; - props->max_srq_sge = adap_info->max_recv_sge; - - return 0; -} - -static int hvnd_query_port(struct ib_device *ibdev, u8 port, - struct ib_port_attr *props) -{ - memset(props, 0, sizeof(struct ib_port_attr)); - - props->max_mtu = IB_MTU_4096; - props->active_mtu = IB_MTU_4096; - - /* - * KYS: TBD need to base this on netdev. - */ - props->state = IB_PORT_ACTIVE; - - props->port_cap_flags = IB_PORT_CM_SUP; - - props->gid_tbl_len = 1; - props->pkey_tbl_len = 1; - props->active_width = 1; - props->active_speed = IB_SPEED_DDR; //KYS: check - props->max_msg_sz = -1; - - return 0; -} - -static enum rdma_link_layer -hvnd_get_link_layer(struct ib_device *device, u8 port) -{ - return IB_LINK_LAYER_ETHERNET; -} - -static ssize_t hvnd_show_rev(struct device *dev, struct device_attribute *attr, - char *buf) -{ - return 0; -} - -static ssize_t hvnd_show_fw_ver(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return 0; -} - -static ssize_t hvnd_show_hca(struct device *dev, struct device_attribute *attr, - char *buf) -{ - return 0; -} - -static ssize_t hvnd_show_board(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return 0; -} - -static int hvnd_get_port_immutable(struct ib_device *ibdev, u8 port_num, - struct ib_port_immutable *immutable) -{ - struct ib_port_attr attr; - int err; - - err = hvnd_query_port(ibdev, port_num, &attr); - if (err) - return err; - - immutable->pkey_tbl_len = attr.pkey_tbl_len; - immutable->gid_tbl_len = attr.gid_tbl_len; - immutable->core_cap_flags = RDMA_CORE_PORT_IWARP; - - return 0; -} - -static struct ib_qp *hvnd_ib_create_qp(struct ib_pd *pd, - struct ib_qp_init_attr *attrs, - struct ib_udata *udata) -{ - struct hvnd_ucontext *uctx; - struct hvnd_dev *nd_dev; - struct mlx4_ib_create_qp ucmd; - struct hvnd_qp *qp; - int ret = 0; - struct hvnd_ib_pd *hvnd_pd = to_nd_pd(pd); - struct hvnd_cq *send_cq = to_nd_cq(attrs->send_cq); - struct hvnd_cq *recv_cq = to_nd_cq(attrs->recv_cq); - - uctx = get_uctx_from_pd(pd); - nd_dev = to_nd_dev(pd->device); - - if (attrs->qp_type != IB_QPT_RC) { - hvnd_error("attrs->qp_type=%d not IB_QPT_RC\n", attrs->qp_type); - return ERR_PTR(-EINVAL); - } - - qp = kzalloc(sizeof(*qp), GFP_KERNEL); - if (!qp) { - ret = 
-ENOMEM; - goto err_done; - } - - qp->uctx = uctx; - - if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { - hvnd_error("ib_copy_from_udata failed\n"); - ret = -EFAULT; - goto err_ucpy; - } - - qp->qp_buf = (void *)ucmd.buf_addr; - qp->db_addr = (void *)ucmd.db_addr; - qp->log_sq_bb_count = ucmd.log_sq_bb_count; - qp->log_sq_stride = ucmd.log_sq_stride; - qp->sq_no_prefetch = ucmd.sq_no_prefetch; - qp->port = attrs->port_num; - - init_waitqueue_head(&qp->wait); - atomic_set(&qp->refcnt, 1); - - qp->recv_cq = recv_cq; - qp->send_cq = send_cq; - qp->nd_dev = nd_dev; - - qp->receive_cq_handle = recv_cq->cq_handle; - qp->initiator_cq_handle = send_cq->cq_handle; - qp->pd_handle = hvnd_pd->handle; - qp->cq_notify = false; - - qp->ibqp.qp_num = attrs->qp_type == IB_QPT_SMI ? 0 : 1; - - qp->max_inline_data = attrs->cap.max_inline_data; - - qp->initiator_q_depth = attrs->cap.max_send_wr; - qp->initiator_request_sge = attrs->cap.max_send_sge; - - - qp->receive_q_depth = attrs->cap.max_recv_wr; - qp->receive_request_sge = attrs->cap.max_recv_sge; - - set_rq_size(nd_dev, &attrs->cap, qp); - - set_user_sq_size(nd_dev, qp, &ucmd); - - qp->umem = ib_umem_get(&uctx->ibucontext, ucmd.buf_addr, - qp->buf_size, 0, 0); - if (IS_ERR(qp->umem)) { - ret = PTR_ERR(qp->umem); - hvnd_error("ib_umem_get failed ret=%d\n", ret); - goto err_ucpy; - } - - ret = hvnd_db_map_user(uctx, ucmd.db_addr, &qp->db_umem); - - if (ret) { - hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); - goto err_db_map; - } - - ret = hvnd_create_qp(nd_dev, uctx, qp); - - if (ret) { - hvnd_error("hvnd_create_qp failed ret=%d\n", ret); - goto err_qp; - } - - hvnd_acquire_uctx_ref(uctx); - - qp->ibqp.qp_num = qp->qpn; - qp->ibqp.qp_type = IB_QPT_RC; - - - return &qp->ibqp; - -err_qp: - hvnd_db_unmap_user(uctx, ucmd.db_addr); - -err_db_map: - ib_umem_release(qp->umem); - -err_ucpy: - kfree(qp); -err_done: - return ERR_PTR(ret); -} - -static int hvnd_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, - int attr_mask, struct ib_udata *udata) -{ - struct hvnd_qp *qp = to_nd_qp(ibqp); - struct hvnd_dev *nd_dev = to_nd_dev(ibqp->device); - enum ib_qp_state cur_state, new_state; - int ret = 0; - - if (attr != NULL) { - cur_state = (attr_mask & IB_QP_CUR_STATE) - ? attr->cur_qp_state : qp->qp_state; - new_state = (attr_mask & IB_QP_STATE) - ? attr->qp_state : cur_state; - - hvnd_debug("qp->qp_state is %d new state is %d\n", - qp->qp_state, new_state); - hvnd_debug("current qp state is %d\n", cur_state); - if (attr_mask & IB_QP_STATE) { - /* Ensure the state is valid */ - if (attr->qp_state < 0 || attr->qp_state > IB_QPS_ERR) { - hvnd_error("incorrect qp state attr->qp_state=%d\n", - attr->qp_state); - return -EINVAL; - } - - if (qp->qp_state != new_state) { - qp->qp_state = new_state; - /* - * The only state transition supported is the - * transition to error state. 
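As the comment above says, the only transitions this path acts on are those into the error (and drain) states; everything else falls through untouched. A tiny standalone predicate capturing that policy (illustrative only; the names are hypothetical, and the real code switches on the new ib_qp_state directly):

#include <stdbool.h>

enum qp_state { QPS_RESET, QPS_INIT, QPS_RTR, QPS_RTS,
		QPS_SQD, QPS_SQE, QPS_ERR };

/* Flush only when the state actually changes, and only into ERR or
 * SQD, mirroring the switch on new_state below. */
static bool transition_needs_flush(enum qp_state cur, enum qp_state next)
{
	return cur != next && (next == QPS_ERR || next == QPS_SQD);
}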
- */ - switch (new_state) { - case IB_QPS_ERR: - case IB_QPS_SQD: - ret = hvnd_flush_qp(nd_dev, qp->uctx, - qp); - - if (ret) - hvnd_error("hvnd_flush_qp failed ret=%d\n", ret); - - // immediately notify the upper layer - // on disconnection - if (!ret && qp->connector) - hvnd_process_notify_disconnect( - qp->connector, - STATUS_SUCCESS); - - return ret; - - default: - break; - } - } - } - } - return 0; -} - - -static int hvnd_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, - int attr_mask, struct ib_qp_init_attr *init_attr) -{ - struct hvnd_qp *qp = to_nd_qp(ibqp); - - memset(attr, 0, sizeof(*attr)); - memset(init_attr, 0, sizeof(*init_attr)); - - attr->qp_state = qp->qp_state; - - init_attr->cap.max_send_wr = qp->max_send_wr; - init_attr->cap.max_recv_wr = qp->max_recv_wr; - - init_attr->cap.max_send_sge = qp->max_send_sge; - init_attr->cap.max_recv_sge = qp->max_recv_sge; - init_attr->cap.max_inline_data = qp->max_inline_data; - - init_attr->sq_sig_type = IB_SIGNAL_ALL_WR; - - return 0; -} - -static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status); -static int hvnd_destroy_qp(struct ib_qp *ib_qp) -{ - int ret; - struct hvnd_qp *qp = to_nd_qp(ib_qp); - struct hvnd_dev *nd_dev = to_nd_dev(ib_qp->device); - u64 jiffies; - - if (!qp->connector) { - hvnd_warn("error: connector is NULL; skip destroying connector\n"); - goto free_qp; - } - - /* - * should we flush the qp first on ctrl-C?, - * no need to disconnect on abrupt shutdown? - */ - if (qp->qp_state != IB_QPS_ERR && qp->qp_state != IB_QPS_SQD) { - hvnd_warn("qp_state=%d, doing abrupt disconnect\n", - qp->qp_state); - hvnd_flush_qp(nd_dev, qp->uctx, qp); - - ep_stop(qp->connector); - - // now no pending activity is possible on the connector - - switch (qp->connector->cm_state) { - - case hvnd_cm_idle: - case hvnd_cm_connect_reply_refused: - case hvnd_cm_connect_request_sent: - case hvnd_cm_close_sent: - hvnd_warn("cm_state = %d not doing anything\n", - qp->connector->cm_state); - break; - - case hvnd_cm_connect_received: - hvnd_warn("cm_state = %d refusing pending connection request\n", - qp->connector->cm_state); - hvnd_refuse_connection(qp->connector, -ECONNREFUSED); - break; - - case hvnd_cm_connect_reply_sent: - case hvnd_cm_established_sent: - case hvnd_cm_accept_sent: - hvnd_warn("cm_state = %d notifying disconnect on existing connection\n", - qp->connector->cm_state); - hvnd_process_notify_disconnect(qp->connector, - STATUS_CANCELLED); - break; - - default: - hvnd_error("unknown cm_state = %d\n", - qp->connector->cm_state); - - } - goto free_connector; - } else { - hvnd_debug("qp_state=%d, doing normal disconnect\n", - qp->qp_state); - } - - if (!ep_add_work_pending(qp->connector)) - goto free_connector; - - init_completion(&qp->connector->disconnect_event); - - /* - * First issue a disconnect on the connector. - */ - - hvnd_debug("calling hvnd_connector_disconnect\n"); - ret = hvnd_connector_disconnect(nd_dev, qp->uctx, - qp->connector->ep_handle, - qp->connector); - if (ret) { - ep_del_work_pending(qp->connector); - hvnd_error("disconnect: retval is %d\n", ret); - ep_stop(qp->connector); - goto free_connector; - } - /* - * Now wait for the disconnect. - */ - jiffies = get_jiffies_64(); - wait_for_completion(&qp->connector->disconnect_event); - hvnd_debug("Completed disconnect connector=%p jiffies=%llu\n", - qp->connector, get_jiffies_64() - jiffies); - - /* - * Now free up the connector and drop the reference on uctx. 
-	 */
-
-	ep_stop(qp->connector);
-
-free_connector:
-	hvnd_debug("destroying connector handle: %p\n",
-		   (void *) qp->connector->ep_handle);
-	hvnd_free_handle(nd_dev, qp->uctx,
-			 qp->connector->ep_handle,
-			 IOCTL_ND_CONNECTOR_FREE);
-
-	hvnd_drop_uctx_ref(nd_dev, qp->uctx);
-	hvnd_destroy_ep(qp->connector);
-	qp->connector = NULL;
-free_qp:
-	atomic_dec(&qp->refcnt);
-	hvnd_debug("Waiting for the ref cnt to go to 0\n");
-
-	wait_event(qp->wait, !atomic_read(&qp->refcnt));
-
-	hvnd_debug("About to destroy qp\n");
-	hvnd_db_unmap_user(qp->uctx, (u64)qp->db_addr);
-	ib_umem_release(qp->umem);
-
-	hvnd_debug("About to free qp\n");
-	ret = hvnd_free_qp(nd_dev, qp->uctx, qp);
-
-	if (ret == 0) {
-		hvnd_drop_uctx_ref(nd_dev, qp->uctx);
-		kfree(qp);
-	} else {
-		hvnd_error("free qp failed: ret is %d\n", ret);
-	}
-
-	return ret;
-}
-
-static struct ib_cq *hvnd_ib_create_cq(struct ib_device *ibdev,
-				       const struct ib_cq_init_attr *attr,
-				       struct ib_ucontext *ib_context,
-				       struct ib_udata *udata)
-{
-	struct hvnd_ucontext *uctx;
-	struct hvnd_dev *nd_dev;
-	struct mlx4_ib_create_cq ucmd;
-	struct hvnd_cq *cq;
-	int ret = 0;
-	int entries = attr->cqe;
-
-	uctx = to_nd_context(ib_context);
-	nd_dev = to_nd_dev(ibdev);
-
-	if (entries < 1 || entries > uctx->max_cqe) {
-		hvnd_error("incorrect entries=%d\n", entries);
-		ret = -EINVAL;
-		goto err_done;
-	}
-
-	cq = kzalloc(sizeof(*cq), GFP_KERNEL);
-	if (!cq) {
-		ret = -ENOMEM;
-		goto err_done;
-	}
-
-	entries = roundup_pow_of_two(entries + 1);
-	cq->ibcq.cqe = entries - 1;
-	cq->entries = entries;
-	cq->uctx = uctx;
-
-	if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) {
-		hvnd_error("ib_copy_from_udata failed\n");
-		ret = -EFAULT;
-		goto err_ucpy;
-	}
-
-	cq->cq_buf = (void *)ucmd.buf_addr;
-	cq->db_addr = (void *)ucmd.db_addr;
-	cq->arm_sn = 0;
-
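One subtlety above: the driver allocates one slot more than the caller asked for, rounds the total up to a power of two, and reports entries - 1 back through ibcq.cqe. Keeping a spare slot is the usual way to tell a full ring from an empty one once head and tail indices wrap. A small worked sketch of the arithmetic (the roundup helper is spelled out here because the kernel's roundup_pow_of_two() is a macro):

#include <stdio.h>

static unsigned int pow2_roundup(unsigned int n)
{
	unsigned int p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

int main(void)
{
	unsigned int requested = 100;
	unsigned int entries = pow2_roundup(requested + 1);	/* 128 */

	printf("allocate %u slots, report %u usable CQEs\n",
	       entries, entries - 1);				/* 127 */
	return 0;
}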
-	/*
-	 * Initialize the IRP state. Need to have a separate irp state
-	 * for CQ; for now share it with Listener/connector.
-	 */
-	ret = hvnd_init_ep(&cq->ep_object, NULL, ND_CQ, nd_dev, uctx);
-
-	if (ret) {
-		hvnd_error("hvnd_init_ep failed ret=%d\n", ret);
-		goto err_ucpy;
-	}
-
-	cq->ep_object.cq = cq;
-	cq->monitor = true;
-
-	cq->umem = ib_umem_get(ib_context, ucmd.buf_addr,
-			       (entries * uctx->cqe_size),
-			       IB_ACCESS_LOCAL_WRITE, 1);
-	if (IS_ERR(cq->umem)) {
-		ret = PTR_ERR(cq->umem);
-		hvnd_error("ib_umem_get failed ret=%d\n", ret);
-		goto err_ucpy;
-	}
-
-	ret = hvnd_db_map_user(uctx, ucmd.db_addr, &cq->db_umem);
-
-	if (ret) {
-		hvnd_error("hvnd_db_map_user failed ret=%d\n", ret);
-		goto err_db_map;
-	}
-
-	ret = hvnd_create_cq(nd_dev, uctx, cq);
-
-	if (ret) {
-		hvnd_error("hvnd_create_cq failed ret=%d\n", ret);
-		goto err_cq;
-	}
-
-	cq->ep_object.ep_handle = cq->cq_handle;
-
-	if (ib_copy_to_udata(udata, &cq->cqn, sizeof(__u32))) {
-		hvnd_error("ib_copy_to_udata failed\n");
-		ret = -EFAULT;
-		goto err_ucpy_out;
-	}
-
-	if (!disable_cq_notify) {
-
-		if (!ep_add_work_pending(&cq->ep_object))
-			goto err_ucpy_out;
-
-		ret = hvnd_notify_cq(nd_dev, cq, ND_CQ_NOTIFY_ANY,
-				     (u64)&cq->ep_object);
-
-		if (ret) {
-			ep_del_work_pending(&cq->ep_object);
-			hvnd_error("hvnd_notify_cq failed ret=%d\n", ret);
-			goto err_ucpy_out;
-		}
-	}
-
-	hvnd_acquire_uctx_ref(uctx);
-
-	return &cq->ibcq;
-
-err_ucpy_out:
-	hvnd_destroy_cq(nd_dev, cq);
-
-err_cq:
-	hvnd_db_unmap_user(uctx, ucmd.db_addr);
-
-err_db_map:
-	ib_umem_release(cq->umem);
-
-err_ucpy:
-	kfree(cq);
-err_done:
-	return ERR_PTR(ret);
-}
-
-static struct ib_qp *hvnd_get_qp(struct ib_device *dev, int qpn)
-{
-	struct hvnd_dev *nd_dev;
-	struct hvnd_qp *qp = NULL;
-
-	nd_dev = to_nd_dev(dev);
-	qp = get_qpp(nd_dev, qpn);
-	return qp ? &qp->ibqp : NULL;
-}
-
-static int hvnd_ib_destroy_cq(struct ib_cq *ib_cq)
-{
-	struct hvnd_ucontext *uctx;
-	struct hvnd_dev *nd_dev;
-	struct hvnd_cq *cq;
-
-	cq = to_nd_cq(ib_cq);
-	uctx = cq->uctx;
-	nd_dev = to_nd_dev(uctx->ibucontext.device);
-
-	cq->monitor = false;
-
-//	hvnd_cancel_io(&cq->ep_object);
-	ep_stop(&cq->ep_object);
-
-	hvnd_deinit_ep(&cq->ep_object);
-
-	hvnd_db_unmap_user(uctx, (u64)cq->db_addr);
-	ib_umem_release(cq->umem);
-
-	hvnd_destroy_cq(nd_dev, cq);
-
-	hvnd_drop_uctx_ref(nd_dev, uctx);
-	kfree(cq);
-
-	return 0;
-}
-
-static int hvnd_resize_cq(struct ib_cq *cq, int cqe, struct ib_udata *udata)
-{
-	/*
-	 * NDDirect does not support resizing CQ.
- */ - hvnd_info("check code\n"); - return -ENOSYS; -} - -static int hvnd_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc) -{ - hvnd_info("check code\n"); - return 0; -} - -static struct ib_mr *hvnd_get_dma_mr(struct ib_pd *pd, int acc) -{ - hvnd_info("check code\n"); - return NULL; -} - -static void debug_dump_umem(struct ib_umem *umem) -{ -#ifdef HVND_MEM_DEBUG - struct ib_umem_chunk *chunk; - struct scatterlist *sg; - int len, j, entry; - int shift = ffs(umem->page_size) - 1; - - hvnd_debug("umem=%p\n", umem); - hvnd_debug("context=%p length=%lu offset=%d page_size=%d writable=%d hugetlb=%d\n", - umem->context, - umem->length, - umem->offset, - umem->page_size, - umem->writable, - umem->hugetlb); - - list_for_each_entry(chunk, &umem->chunk_list, list) { - hvnd_debug("chunk->nmap=%d\n", chunk->nmap); - for (j = 0; j < chunk->nmap; ++j) { - sg = &chunk->page_list[j]; - hvnd_debug("sg_dma_len=%d sg_dma_address=%llx\n", - sg_dma_len(sg), sg_dma_address(sg)); - hvnd_debug("page_link=%lx offset=%u length=%u\n", - sg->page_link, sg->offset, sg->length); - len = sg_dma_len(&chunk->page_list[j]) >> shift; - for_each_sg(&chunk->page_list[j], sg, len, entry) { - hvnd_debug("PFN=%lu\n", - page_to_pfn(sg_page(sg))); - } - } - } -#endif -} - - -static struct ib_mr *hvnd_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, - u64 virt, int acc, struct ib_udata *udata) -{ - int err = 0; - struct hvnd_ib_pd *hvndpd = to_nd_pd(pd); - struct hvnd_mr *mr; - - mr = kmalloc(sizeof(*mr), GFP_KERNEL); - if (!mr) - return ERR_PTR(-ENOMEM); - mr->pd = hvndpd; - - mr->umem = ib_umem_get(pd->uobject->context, start, length, acc, 0); - if (IS_ERR(mr->umem)) { - err = PTR_ERR(mr->umem); - hvnd_error("ib_umem_get failed ret=%d\n", err); - kfree(mr); - return ERR_PTR(err); - } - - debug_dump_umem(mr->umem); - - mr->start = start; - mr->length = length; - mr->virt = virt; - mr->acc = acc; - - hvnd_debug("start=%llx length=%llx virt=%llx acc=%d\n", - start, length, virt, acc); - - /* - * First create a memory region. - */ - err = hvnd_cr_mr(to_nd_dev(pd->device), - to_nd_context(pd->uobject->context), hvndpd->handle, - &mr->mr_handle); - if (err) { - hvnd_error("cr_mr failed; ret is %d\n", err); - goto err; - } - - err = hvnd_mr_register(to_nd_dev(pd->device), - to_nd_context(pd->uobject->context), mr); - - if (err) - goto err0; - - hvnd_acquire_uctx_ref(to_nd_context(pd->uobject->context)); - - return &mr->ibmr; - -err0: - hvnd_free_mr(to_nd_dev(pd->device), - to_nd_context(pd->uobject->context), mr->mr_handle); -err: - ib_umem_release(mr->umem); - kfree(mr); - return ERR_PTR(err); -} - - - -static int hvnd_dereg_mr(struct ib_mr *ib_mr) -{ - int ret; - struct hvnd_mr *mr = to_nd_mr(ib_mr); - struct hvnd_ucontext *uctx = to_nd_context(ib_mr->pd->uobject->context); - struct hvnd_dev *nd_dev = to_nd_dev(ib_mr->device); - - - hvnd_debug("dereg_mr entering\n"); - - ret = hvnd_deregister_mr(nd_dev, uctx, mr->mr_handle); - - if (ret) { - hvnd_error("hvnd_deregister_mr() failed: %x\n", ret); - return ret; - } - /* - * Now free up the memory region. 
-	 */
-
-	ret = hvnd_free_mr(nd_dev, uctx, mr->mr_handle);
-	if (ret) {
-		hvnd_error("hvnd_free_mr() failed: %x\n", ret);
-		return ret;
-	}
-
-	ib_umem_release(mr->umem);
-
-	hvnd_drop_uctx_ref(nd_dev, uctx);
-	kfree(mr);
-
-	hvnd_debug("dereg_mr done\n");
-	return 0;
-}
-
-static struct ib_mw *hvnd_alloc_mw(struct ib_pd *pd, enum ib_mw_type type,
-				   struct ib_udata *udata)
-{
-	hvnd_info("check code\n");
-	return NULL;
-}
-
-static int hvnd_dealloc_mw(struct ib_mw *mw)
-{
-	debug_check(__func__, __LINE__);
-	return 0;
-}
-
-static int hvnd_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
-{
-	struct hvnd_ucontext *uctx;
-	struct hvnd_dev *nd_dev;
-	struct hvnd_cq *cq;
-
-	cq = to_nd_cq(ibcq);
-	uctx = cq->uctx;
-	nd_dev = to_nd_dev(uctx->ibucontext.device);
-
-
-	debug_check(__func__, __LINE__);
-
-	return 0;
-}
-
-static int hvnd_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr,
-			  struct ib_send_wr **bad_wr)
-{
-	debug_check(__func__, __LINE__);
-	return 0;
-}
-
-int hvnd_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr,
-		      struct ib_recv_wr **bad_wr)
-{
-	debug_check(__func__, __LINE__);
-	return 0;
-}
-
-static int hvnd_resolve_addr(struct sockaddr_in *laddr,
-			     struct sockaddr_in *raddr,
-			     struct if_physical_addr *phys_addrstruct)
-{
-	int ret;
-
-	phys_addrstruct->length = ETH_ALEN;
-	ret = hvnd_get_neigh_mac_addr((struct sockaddr *)laddr,
-				      (struct sockaddr *)raddr,
-				      phys_addrstruct->addr);
-
-	hvnd_debug("Dest MAC is %pM\n", phys_addrstruct->addr);
-	return ret;
-}
-
-static int hvnd_connect(struct iw_cm_id *cm_id,
-			struct iw_cm_conn_param *conn_param)
-{
-	int ret = 0;
-	struct hvnd_dev *nd_dev;
-	struct hvnd_ep_obj *ep_object;
-	struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_id->remote_addr;
-	struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr;
-	struct hvnd_qp *qp;
-	struct if_physical_addr phys_addrstruct;
-	union nd_sockaddr_inet dest_addr;
-	u64 connector_handle;
-	union nd_sockaddr_inet addr;
-	char addr_buf[50];
-
-	if (cm_id->remote_addr.ss_family != AF_INET) {
-		hvnd_error("cm_id->remote_addr.ss_family=%d not AF_INET\n",
-			   cm_id->remote_addr.ss_family);
-		return -ENOSYS;
-	}
-
-	qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn);
-
-	if (!qp) {
-		hvnd_error("failed to find qp conn_param->qpn=%d\n",
-			   conn_param->qpn);
-		return -EINVAL;
-	}
-
-	cm_id->provider_data = qp;
-	cm_id->add_ref(cm_id);
-	qp->cm_id = cm_id;
-
-	/*
-	 * Set the read/write limits.
-	 * Can we change the limits on a created QP? Luke?
-	 */
-	nd_dev = to_nd_dev(cm_id->device);
-	ep_object = hvnd_setup_ep(cm_id, ND_CONNECTOR, nd_dev, qp->uctx);
-	if (!ep_object) {
-		hvnd_error("hvnd_setup_ep failure\n");
-		ret = -ENOMEM;
-		goto err_limit;
-	}
-	hvnd_debug("active connection: local irp is %d\n",
-		   ep_object->local_irp);
-
-	ret = hvnd_cr_connector(nd_dev, qp->uctx,
-				&connector_handle);
-
-	if (ret) {
-		hvnd_error("hvnd_cr_connector failure ret=%d\n", ret);
-		goto err_cr_connector;
-	}
-
-	hvnd_acquire_uctx_ref(qp->uctx);
-	ep_object->ep_handle = connector_handle;
-	ep_object->incoming = false;
-	qp->connector = ep_object;
-
-	/*
-	 * Bind the local address to the connector.
- */ - hvnd_debug("Connect local address is %s\n", - debug_inet_ntoa(laddr->sin_addr, addr_buf)); - - memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); - hvnd_debug("CONNECT AF %d port %d addr %s\n", - addr.ipv4.sin_family, addr.ipv4.sin_port, - debug_inet_ntoa(addr.ipv4.sin_addr, addr_buf)); - - ret = hvnd_bind_connector(nd_dev, qp->uctx, - connector_handle, - &addr); - - if (ret) { - hvnd_error("hvnd_bind_connector failed ret=%d\n", ret); - goto err_bind_connector; - } - - ret = hvnd_resolve_addr(laddr, raddr, &phys_addrstruct); - if (ret) { - hvnd_error("hvnd_resolve_addr failed ret=%d\n", ret); - goto err_bind_connector; - } - - memcpy(&dest_addr.ipv4, raddr, sizeof(struct sockaddr_in)); - - - - /* - * Now attempt to connect. - */ - - hvnd_debug("About to initiate connection\n"); - - if (!ep_add_work_pending(ep_object)) - goto err_bind_connector; - - ep_object->cm_state = hvnd_cm_connect_received; - ret = hvnd_connector_connect(nd_dev, qp->uctx, - ep_object->ep_handle, - conn_param->ird, conn_param->ord, - conn_param->private_data_len, - (u8 *)conn_param->private_data, - qp->qp_handle, - &phys_addrstruct, &dest_addr, - ep_object); - - if (ret == 0) - return 0; - - ep_object->cm_state = hvnd_cm_idle; - ep_del_work_pending(ep_object); - hvnd_error("hvnd_connector_connect failed ret=%d\n", ret); - -err_bind_connector: - qp->connector = NULL; - hvnd_free_connector(nd_dev, qp->uctx, - connector_handle); - hvnd_drop_uctx_ref(nd_dev, qp->uctx); - -err_cr_connector: - kfree(ep_object); - -err_limit: - cm_id->provider_data = NULL; - qp->cm_id = NULL; - cm_id->rem_ref(cm_id); - return ret; -} - -static int hvnd_accept_cr(struct iw_cm_id *cm_id, - struct iw_cm_conn_param *conn_param) -{ - int ret = 0; - struct hvnd_dev *nd_dev; - struct hvnd_qp *qp; - struct hvnd_ep_obj *connector; - enum ibv_qp_state new_qp_state; - - hvnd_debug("Accepting connection - PASSIVE\n"); - nd_dev = to_nd_dev(cm_id->device); - qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); - - if (!qp) { - hvnd_error("get_qpp failed conn_param->qpn=%d\n", - conn_param->qpn); - return -EINVAL; - } - - - connector = (struct hvnd_ep_obj *)cm_id->provider_data; - qp->connector = connector; - - if (connector == NULL) { - hvnd_error("NULL connector!\n"); - return -EINVAL; - } - hvnd_debug("connector's cm_id is %p caller cm_id=%p\n", - connector->cm_id, cm_id); - - connector->cq = qp->recv_cq; - - - /* - * Setup state for the accepted connection. 
- */ - cm_id->add_ref(cm_id); - connector->cm_id = cm_id; - if (conn_param == NULL) { - hvnd_error("NULL conn_param!\n"); - return -EINVAL; - } - - connector->ord = conn_param->ord; - connector->ird = conn_param->ird; - - if (!ep_add_work_pending(connector)) - goto error; - - init_completion(&connector->connector_accept_event); - - ret = hvnd_connector_accept(nd_dev, qp->uctx, connector->ep_handle, - qp->qp_handle, conn_param->ird, - conn_param->ord, - conn_param->private_data_len, - conn_param->private_data, - &new_qp_state, connector); - - if (ret) { - ep_del_work_pending(connector); - hvnd_error("connector accept failed\n"); - goto error; - } - - wait_for_completion(&connector->connector_accept_event); - ret = connector->connector_accept_status; - - if (ret) { - hvnd_error("connector_accept failed status=%x\n", ret); - ret = -EIO; - goto error; - } - - hvnd_debug("Passive Connection Accepted; new qp state is %d\n", - new_qp_state); - connector->cm_state = hvnd_cm_accept_sent; - return 0; - -error: - ep_stop(connector); - connector->cm_id = NULL; - connector->cm_state = hvnd_cm_idle; - - qp->connector = NULL; - cm_id->rem_ref(cm_id); - - return ret; -} - -static int hvnd_reject_cr(struct iw_cm_id *cm_id, const void *pdata, - u8 pdata_len) -{ - debug_check(__func__, __LINE__); - return 0; -} - -void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status) -{ - struct iw_cm_event cm_event; - - switch (status) { - case STATUS_SUCCESS: - case STATUS_CANCELLED: - break; - - default: - hvnd_warn("disconnect complete failed: status:%d\n", status); - } - - - hvnd_debug("active disconnect processed\n"); - memset(&cm_event, 0, sizeof(cm_event)); - - complete(&ep_object->disconnect_event); -} - - -void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status) -{ - struct iw_cm_event cm_event; - - // make sure we only disconnect once - if (atomic_xchg(&ep_object->disconnect_notified, 1)) - return; - - /* - * Turn off CQ monitoring. - */ - if (ep_object->cq) - ep_object->cq->monitor = false; - - switch (ep_object->cm_state) { - case hvnd_cm_connect_reply_sent: - case hvnd_cm_established_sent: - case hvnd_cm_accept_sent: - break; - - default: - hvnd_error("unexpected cm_state=%d\n", ep_object->cm_state); - return; - } - - switch (status) { - case STATUS_SUCCESS: - case STATUS_CANCELLED: - case STATUS_DISCONNECTED: - break; - - default: - hvnd_warn("notify disconnect complete failed: status:%d\n", - status); - } - - hvnd_debug("passive disconnect notified\n"); - memset(&cm_event, 0, sizeof(cm_event)); - - /* - * Other end disconnected. - * Connection has been disconnected; - * notify the cm layer. - */ - cm_event.status = -ECONNRESET; - cm_event.event = IW_CM_EVENT_CLOSE; - - if ((ep_object->cm_id) && - (ep_object->cm_id->event_handler)) { - - ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); - - ep_object->cm_id->rem_ref(ep_object->cm_id); - ep_object->cm_state = hvnd_cm_close_sent; - } -} - -void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status) -{ - struct iw_cm_event cm_event; - int ret; - - /* - * This is the problem area the return status may be: - * 1: 0xc00000b5 (3221225653) - {Device Timeout} The specified I/O - * operation on %hs was not completed before the time-out period - * expired - * 2: NTSTATUS 0xc0000241 (3221226049) - The transport connection was - * aborted by the local system. 
- * If we do nothing here, iwcm will wait for IW_CM_EVENT_ESTABLISHED - * forever, and unable to clean shutdown need to fail the call earlier - * on accept. - */ - - ep_object->connector_accept_status = status; - - if (status) { - hvnd_error("Connector accept failed; status is %x\n", status); - complete(&ep_object->connector_accept_event); - return; - } - - memset(&cm_event, 0, sizeof(cm_event)); - cm_event.event = IW_CM_EVENT_ESTABLISHED; - cm_event.ird = ep_object->ird; - cm_event.ord = ep_object->ord; - cm_event.provider_data = (void *)ep_object; - - /* - * We have successfully passively accepted the - * incoming connection. - */ - - hvnd_debug("Passive connection accepted!!\n"); - if ((ep_object->cm_id) && - (ep_object->cm_id->event_handler)) { - ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); - ep_object->cm_state = hvnd_cm_established_sent; - } - - complete(&ep_object->connector_accept_event); - - /* - * Request notification if the other end - * were to disconnect. - */ - if (!ep_add_work_pending(ep_object)) - return; - - ret = hvnd_connector_notify_disconnect(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - ep_object); - - if (ret) { - ep_del_work_pending(ep_object); - hvnd_error("Connector notify disconnect failed; ret: %d\n", - ret); - } -} - - -void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep_object, - int status) -{ - - struct ib_cq *ibcq; - struct hvnd_cq *cq; - - cq = ep_object->cq; - ibcq = &ep_object->cq->ibcq; - - if (!cq->monitor) - return; - - // call the previous CQ complete - if ((status == STATUS_PENDING) && cq->upcall_pending && - ibcq->comp_handler) { - ibcq->comp_handler(ibcq, ibcq->cq_context); - cq->upcall_pending = false; - hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); - } - - if ((status != STATUS_PENDING) && ibcq->comp_handler && - ibcq->cq_context) { - ibcq->comp_handler(ibcq, ibcq->cq_context); - hvnd_error("CQ comp_handler called status=%x\n", status); - } -} - -void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep_object, - int status) -{ - struct ib_cq *ibcq; - struct hvnd_cq *cq; - int ret; - - cq = ep_object->cq; - ibcq = &ep_object->cq->ibcq; - - // call hte previous CQ complete - if (cq->upcall_pending && ibcq->comp_handler) { - ibcq->comp_handler(ibcq, ibcq->cq_context); - cq->upcall_pending = false; - hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); - } - - cq->upcall_pending = true; - if (!ep_add_work_pending(ep_object)) - return; - - ret = hvnd_notify_cq(ep_object->nd_dev, - ep_object->cq, - ND_CQ_NOTIFY_ANY, - (u64)ep_object); - - if (ret) { - ep_del_work_pending(ep_object); -// hvnd_manage_io_state(ep_object, true); - hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); - } - - if ((status != 0) && (status != STATUS_CANCELLED)) { - if (ibcq->event_handler) { - struct ib_event event; - - event.device = ibcq->device; - event.event = IB_EVENT_CQ_ERR; - event.element.cq = ibcq; - ibcq->event_handler(&event, ibcq->cq_context); - - hvnd_warn("CQ event_handler called status=%x\n", - status); - } - } -} - -int init_cm_event(struct hvnd_ep_obj *ep_object, struct iw_cm_event *cm_event, - int event) -{ - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event->local_addr; - struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event->remote_addr; - struct nd_read_limits rd_limits; - union nd_sockaddr_inet local_addr; - union nd_sockaddr_inet remote_addr; - int ret; - - /* - * Now get the local address. 
- */ - ret = hvnd_connector_get_local_addr(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - &local_addr); - - if (ret) { - hvnd_error("Connector get addr failed; ret: %d\n", ret); - return ret; - } - /* - * Now get the remote address. - */ - ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - &remote_addr); - - if (ret) { - hvnd_error("Connector get peer addr failed; ret: %d\n", ret); - return ret; - } - - /* - * Get other connection parameters. - */ - - ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - &rd_limits); - - if (ret) { - hvnd_error("Connector rd limits failed; ret: %d\n", ret); - return ret; - } - - /* - * XXXKYS: Luke: What about the length of the priv data? - */ - ret = hvnd_connector_get_priv_data(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - ep_object->priv_data); - - if (ret) { - hvnd_error("Connector get priv data failed; ret: %d\n", ret); - return ret; - } - /* - * Initialize CM structure. - */ - laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; - hvnd_debug("Local addr is %d\n", laddr->sin_addr.s_addr); - laddr->sin_port = local_addr.ipv4.sin_port; - laddr->sin_family = AF_INET; - - raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; - hvnd_debug("Remote addr is %d\n", raddr->sin_addr.s_addr); - raddr->sin_port = remote_addr.ipv4.sin_port; - raddr->sin_family = AF_INET; - - // KYS; LUke: is it always 148 bytes? - cm_event->private_data_len = MAX_PRIVATE_DATA_LEN; - cm_event->private_data = ep_object->priv_data; - - cm_event->ird = rd_limits.inbound; - cm_event->ord = rd_limits.outbound; - cm_event->event = event; - - ep_object->ird = cm_event->ird; - ep_object->ord = cm_event->ord; - - return 0; -} - -static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status) -{ - struct iw_cm_event cm_event; - - memset(&cm_event, 0, sizeof(cm_event)); - - cm_event.event = IW_CM_EVENT_CONNECT_REPLY; - cm_event.status = status; - - hvnd_debug("returning status %d on connector %p\n", status, connector); - - if (connector->cm_id && connector->cm_id->event_handler) { - connector->cm_id->event_handler(connector->cm_id, &cm_event); - connector->cm_id->rem_ref(connector->cm_id); - connector->cm_state = hvnd_cm_connect_reply_refused; - } -} - -void hvnd_process_events(struct work_struct *work) -{ - struct hvnd_work *wrk; - struct nd_read_limits rd_limits; - struct hvnd_ep_obj *ep_object; - struct hvnd_ep_obj *parent; - struct iw_cm_event cm_event; - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event.local_addr; - struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event.remote_addr; - struct ndv_packet_hdr_control_1 *ctrl_hdr; - union nd_sockaddr_inet local_addr; - union nd_sockaddr_inet remote_addr; - struct pkt_nd_get_connection_listener *connection_pkt; - struct iw_cm_id *cm_id = NULL; - int status; - int ioctl; - int ret; - char priv_data[MAX_PRIVATE_DATA_LEN]; - enum ibv_qp_state new_qp_state; - struct incoming_pkt *incoming_pkt; - unsigned long flags; - - memset(&cm_event, 0, sizeof(cm_event)); - memset(&priv_data, 0, MAX_PRIVATE_DATA_LEN); - - - wrk = container_of(work, struct hvnd_work, work); - - /* - * Now call into the connection manager. 
-	 */
-	ep_object = (struct hvnd_ep_obj *)wrk->callback_arg;
-	parent = ep_object->parent;
-
-process_next:
-	incoming_pkt = NULL;
-	spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags);
-	if (!list_empty(&ep_object->incoming_pkt_list)) {
-		incoming_pkt = list_first_entry(&ep_object->incoming_pkt_list,
-						struct incoming_pkt,
-						list_entry);
-		list_del(&incoming_pkt->list_entry);
-	}
-	spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, flags);
-	if (incoming_pkt == NULL)
-		return;
-
-	ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt->pkt;
-	status = ctrl_hdr->io_status;
-	ioctl = ctrl_hdr->io_cntrl_code;
-
-	hvnd_debug("Process Events IOCTL is: %s; iostatus failure: %x in work queue\n",
-		   hvnd_get_op_name(ioctl), status);
-
-	if (status != 0) {
-		bool log_error = true;
-
-		if (ioctl == IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT &&
-		    status == STATUS_DISCONNECTED) // expected
-			log_error = false;
-
-		if (log_error)
-			hvnd_warn("Process Events IOCTL is: %s; iostatus failure: %x\n",
-				  hvnd_get_op_name(ioctl), status);
-	}
-
-	cm_event.status = status;
-
-	switch (ep_object->type) {
-	case ND_CONNECTOR:
-		switch (ioctl) {
-		case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST:
-
-			if (ep_object->parent != NULL) {
-
-				// Do nothing with this connection request if
-				// listener is stopping
-				if (!ep_add_work_pending(ep_object->parent))
-					break;
-
-				cm_id = ep_object->parent->cm_id; //Listener
-			}
-
-			connection_pkt = (struct pkt_nd_get_connection_listener *) ctrl_hdr;
-
-			if ((status == 0) || (status == STATUS_CANCELLED)) {
-				hvnd_get_incoming_connections(ep_object->parent,
-							      ep_object->parent->nd_dev,
-							      ep_object->uctx);
-			}
-
-			if (status)
-				goto get_connection_request_done;
-
-			/*
-			 * Now get the local address.
-			 */
-			ret = hvnd_connector_get_local_addr(ep_object->nd_dev,
-							    ep_object->uctx,
-							    ep_object->ep_handle,
-							    &local_addr);
-
-			if (ret) {
-				hvnd_error("Connector get addr failed; ret: %d\n",
-					   ret);
-				goto get_connection_request_done;
-			}
-			/*
-			 * Now get the remote address.
-			 */
-			ret = hvnd_connector_get_peer_addr(ep_object->nd_dev,
-							   ep_object->uctx,
-							   ep_object->ep_handle,
-							   &remote_addr);
-
-			if (ret) {
-				hvnd_error("Connector get peer addr failed; ret: %d\n",
-					   ret);
-				goto get_connection_request_done;
-			}
-			/*
-			 * Get other connection parameters.
-			 */
-
-			ret = hvnd_connector_get_rd_limits(ep_object->nd_dev,
-							   ep_object->uctx,
-							   ep_object->ep_handle,
-							   &rd_limits);
-
-			if (ret) {
-				hvnd_error("Connector rd limits failed; ret: %d\n",
-					   ret);
-				goto get_connection_request_done;
-			}
-
-			/*
-			 * XXXKYS: Luke: What about the length of the priv data?
-			 */
-			ret = hvnd_connector_get_priv_data(
-				ep_object->nd_dev, ep_object->uctx,
-				ep_object->ep_handle, ep_object->priv_data);
-
-			if (ret) {
-				hvnd_error("Connector get priv data failed; ret: %d\n",
-					   ret);
-				goto get_connection_request_done;
-			}
-
-			cm_event.event = IW_CM_EVENT_CONNECT_REQUEST;
-			cm_event.provider_data = (void *)ep_object;
-
-			laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr;
-			hvnd_debug("Local addr is %d\n",
-				   laddr->sin_addr.s_addr);
-			laddr->sin_port = local_addr.ipv4.sin_port;
-			laddr->sin_family = AF_INET;
-
-			raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr;
-			hvnd_debug("Remote addr is %d\n",
-				   raddr->sin_addr.s_addr);
-			raddr->sin_port = remote_addr.ipv4.sin_port;
-			raddr->sin_family = AF_INET;
-
-			// KYS; Luke: is it always 148 bytes?
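The loop at the top of this handler is a classic drain pattern: take the spinlock just long enough to pop one packet, process it with the lock dropped, and loop back until the list is empty. A compact userspace model of that loop, with a pthread mutex standing in for the irqsave spinlock:

#include <pthread.h>
#include <stdlib.h>

struct pkt {
	struct pkt *next;
	/* payload elided */
};

static struct pkt *pkt_head;
static pthread_mutex_t pkt_lock = PTHREAD_MUTEX_INITIALIZER;

static void process_pkt(struct pkt *p)
{
	free(p);	/* the driver kfree()s after dispatching the event */
}

static void drain(void)
{
	for (;;) {
		struct pkt *p;

		pthread_mutex_lock(&pkt_lock);
		p = pkt_head;
		if (p)
			pkt_head = p->next;
		pthread_mutex_unlock(&pkt_lock);

		if (!p)
			return;
		process_pkt(p);	/* never called with the lock held */
	}
}

Popping one entry at a time keeps the lock hold time bounded even when the event callbacks are slow, at the cost of re-taking the lock once per packet.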
- cm_event.private_data_len = MAX_PRIVATE_DATA_LEN; - cm_event.private_data = ep_object->priv_data; - - cm_event.ird = rd_limits.inbound; - cm_event.ord = rd_limits.outbound; - - ep_object->ird = cm_event.ird; - ep_object->ord = cm_event.ord; - - - if ((cm_id != NULL) && cm_id->event_handler) { - cm_id->event_handler(cm_id, &cm_event); - ep_object->cm_state = hvnd_cm_connect_request_sent; - } - -get_connection_request_done: - if (ep_object->parent != NULL) - ep_del_work_pending(ep_object->parent); - break; - - case IOCTL_ND_CONNECTOR_CONNECT: - - cm_event.event = IW_CM_EVENT_CONNECT_REPLY; - if (status == STATUS_TIMEOUT && - ep_object->connector_connect_retry < 3) { - //TIMEOUT retry - - if (!ep_add_work_pending(ep_object)) - goto refuse_connection; - - hvnd_warn("Connector connect timed out, reconnecting... retry count: %d\n", - ep_object->connector_connect_retry); - ep_object->connector_connect_retry++; - ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, - &ep_object->connector_connect_pkt.hdr, - sizeof(ep_object->connector_connect_pkt), - (u64)&ep_object->connector_connect_pkt); - - if (ret) { - hvnd_error("Connector on time out failed: %d\n", - ret); - ep_del_work_pending(ep_object); - goto refuse_connection; - } - break; - } - -refuse_connection: - if (status) { - cm_event.status = -ECONNREFUSED; - if (status == STATUS_TIMEOUT) - cm_event.status = -ETIMEDOUT; - - hvnd_refuse_connection(ep_object, - cm_event.status); - break; - } - - hvnd_debug("ACTIVE Connection ACCEPTED\n"); - ret = init_cm_event(ep_object, &cm_event, - IW_CM_EVENT_CONNECT_REPLY); - if (ret) { - hvnd_error("init_cm_event failed ret=%d\n", - ret); - goto process_done; - } - - ret = hvnd_connector_complete_connect(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - &new_qp_state); - if (ret) { - hvnd_error("connector_complete failed\n"); - goto process_done; - } - - cm_event.provider_data = (void *)ep_object; - - if ((ep_object->cm_id) && - (ep_object->cm_id->event_handler)) { - ep_object->cm_id->event_handler( - ep_object->cm_id, &cm_event); - ep_object->cm_state = hvnd_cm_connect_reply_sent; - } - /* - * Rquest notification if the other end - * were to disconnect. 
- */ - if (!ep_add_work_pending(ep_object)) - goto process_done; - - ret = hvnd_connector_notify_disconnect( - ep_object->nd_dev, ep_object->uctx, - ep_object->ep_handle, ep_object); - - if (ret) { - ep_del_work_pending(ep_object); - hvnd_error("Connector notify disconnect failed; ret: %d\n", - ret); - } - - break; - - case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: - hvnd_process_notify_disconnect(ep_object, status); - break; - - - default: - hvnd_error("Unknown Connector IOCTL\n"); - break; - } - break; - default: - hvnd_error("Unknown endpoint object\n"); - break; - } -process_done: - kfree(incoming_pkt); - ep_del_work_pending(ep_object); - - goto process_next; -} - - -static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, - struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx) -{ - struct hvnd_ep_obj *ep_object; - int ret; - - ep_object = kzalloc(sizeof(struct hvnd_ep_obj), GFP_KERNEL); - - if (!ep_object) - return NULL; - - ret = hvnd_init_ep(ep_object, cm_id, ep_type, nd_dev, uctx); - - if (ret) { - hvnd_error("hvnd_init_ep failed ret=%d\n", ret); - kfree(ep_object); - return NULL; - } - - return ep_object; -} - -/* - * return value: - * true: I/O state is stopped, we should not do upcall - * flase: I/O state is running and normal - */ -#if 0 -static bool hvnd_manage_io_state(struct hvnd_ep_obj *ep, bool failure) -{ - unsigned long flags; - - spin_lock_irqsave(&ep->ep_lk, flags); - if (ep->to_be_destroyed) { - hvnd_warn("ep being destroyed\n"); - if (ep->io_outstanding) { - hvnd_warn("ep being destroyed i/O pending waking up on %p\n", - &ep->block_event); - complete(&ep->block_event); - ep->io_outstanding = false; - } - spin_unlock_irqrestore(&ep->ep_lk, flags); - return true; - } - if (!failure) - ep->io_outstanding = true; - spin_unlock_irqrestore(&ep->ep_lk, flags); - return false; -} -#endif - -static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, - struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx) -{ - struct hvnd_ep_obj *connector; - u64 connector_handle; - int ret; - - /* - * First handle the protocol for - * destruction - outstanding I/O. - */ - -// if (hvnd_manage_io_state(listener, false)) -// return 0; - /* - * Create a connector. - */ - connector = hvnd_setup_ep(listener->cm_id, ND_CONNECTOR, nd_dev, uctx); - if (!connector) { - hvnd_error("hvnd_setup_ep failed\n"); - ret = -ENOMEM; - goto con_alloc_err; - } - - ret = hvnd_cr_connector(nd_dev, uctx, - &connector_handle); - if (ret) { - hvnd_error("hvnd_cr_connector failed ret=%d\n", ret); - goto con_cr_err; - } - - /* - * Now get a connection if one is pending. 
-	 */
-	connector->ep_handle = connector_handle;
-	connector->parent = listener;
-
-	if (!ep_add_work_pending(connector))
-		goto get_connection_err;
-
-	ret = hvnd_get_connection_listener(nd_dev, uctx,
-					   listener->ep_handle,
-					   connector_handle,
-					   (u64)connector);
-
-	if (ret) {
-		hvnd_debug("listener_get_connection failed\n");
-		ep_del_work_pending(connector);
-		goto get_connection_err;
-	}
-
-	hvnd_acquire_uctx_ref(uctx);
-	listener->outstanding_handle = connector_handle;
-	listener->outstanding_ep = connector;
-	hvnd_debug("outstanding handle is %p\n", (void *)connector_handle);
-	return 0;
-
-get_connection_err:
-	hvnd_free_handle(nd_dev, uctx,
-			 connector_handle,
-			 IOCTL_ND_CONNECTOR_FREE);
-
-con_cr_err:
-	kfree(connector);
-con_alloc_err:
-//	hvnd_manage_io_state(listener, true);
-	return ret;
-}
-
-static int hvnd_create_listen(struct iw_cm_id *cm_id, int backlog)
-{
-	int ret = 0;
-	struct hvnd_dev *nd_dev;
-	struct hvnd_ucontext *uctx;
-	struct hvnd_ep_obj *ep_object;
-	union nd_sockaddr_inet addr;
-	union nd_sockaddr_inet local_addr;
-	u64 listener_handle;
-	struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr;
-	union nd_sockaddr_inet og_addr;
-
-
-	nd_dev = to_nd_dev(cm_id->device);
-	uctx = get_uctx(nd_dev, current_pid());
-	hvnd_debug("uctx is %p; pid is %d\n", uctx, current_pid());
-
-	if (cm_id->local_addr.ss_family != AF_INET) {
-		hvnd_error("cm_id->local_addr.ss_family =%d not AF_INET\n",
-			   cm_id->local_addr.ss_family);
-		return -ENOSYS;
-	}
-
-	/*
-	 * If the local address is LOOPBACK or INADDR_ANY, get an address
-	 * to bind the listener. For now, just get the first address
-	 * available.
-	 */
-
-	if (IN_LOOPBACK(ntohl(laddr->sin_addr.s_addr)) ||
-	    (laddr->sin_addr.s_addr == INADDR_ANY)) {
-
-		hvnd_debug("need to get an address\n");
-		ret = hvnd_get_outgoing_rdma_addr(nd_dev, uctx, &og_addr);
-
-		if (ret) {
-			hvnd_error("failed to get the og address\n");
-			return ret;
-		}
-
-		laddr->sin_addr.s_addr = og_addr.ipv4.sin_addr.s_addr;
-	}
-
-	cm_id->add_ref(cm_id);
-
-	ep_object = hvnd_setup_ep(cm_id, ND_LISTENER, nd_dev, uctx);
-
-	if (!ep_object) {
-		hvnd_error("hvnd_setup_ep returned NULL\n");
-		ret = -ENOMEM;
-		goto alloc_err;
-	}
-
-	ret = hvnd_cr_listener(nd_dev, uctx,
-			       &listener_handle);
-	if (ret) {
-		hvnd_error("hvnd_cr_listener failed ret=%d\n", ret);
-		goto cr_err;
-	}
-
-	ep_object->ep_handle = listener_handle;
-
-	cm_id->provider_data = ep_object;
-
-	/*
-	 * Now bind the listener.
-	 * IPV4 support only.
-	 */
-	memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in));
-
-	ret = hvnd_bind_listener(nd_dev, uctx,
-				 listener_handle,
-				 &addr);
-	if (ret) {
-		hvnd_error("hvnd_bind_listener failed ret=%d\n", ret);
-		goto bind_err;
-	}
-
-	/*
-	 * Now get the local address.
-	 */
-	ret = hvnd_get_addr_listener(nd_dev, uctx,
-				     listener_handle,
-				     &local_addr);
-	if (ret) {
-		hvnd_error("hvnd_get_addr_listener failed ret=%d\n", ret);
-		goto bind_err;
-	}
-
-	/*
-	 * Now put the listener in the listen mode.
-	 */
-
-	ret = hvnd_listen_listener(nd_dev, uctx,
-				   listener_handle,
-				   backlog);
-
-	if (ret) {
-		hvnd_error("hvnd_listen_listener failed ret=%d\n", ret);
-		goto bind_err;
-	}
-
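The listener setup above will not bind to 0.0.0.0 or a loopback address; it substitutes the first usable outgoing RDMA address instead. A tiny sketch of that classification test (for IPv4, IN_LOOPBACK reduces to a 127/8 check):

#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>

static bool needs_real_addr(const struct sockaddr_in *laddr)
{
	uint32_t ip = ntohl(laddr->sin_addr.s_addr);

	/* INADDR_ANY is all-zero, so the byte order does not matter
	 * for that half of the test. */
	return ip == INADDR_ANY || (ip >> 24) == 127;
}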
-
-	/*
-	 * Now get a pending connection if one is pending.
-	 */
-	ret = hvnd_get_incoming_connections(ep_object, nd_dev, uctx);
-	if (ret) {
-		hvnd_error("hvnd_get_incoming_connections failed ret=%d\n",
-			   ret);
-		goto bind_err;
-	}
-
-	hvnd_acquire_uctx_ref(uctx);
-	hvnd_debug("cm_id=%p\n", cm_id);
-	return 0;
-
-bind_err:
-	hvnd_free_handle(nd_dev, uctx,
-			 listener_handle,
-			 IOCTL_ND_LISTENER_FREE);
-cr_err:
-	kfree(ep_object);
-alloc_err:
-	cm_id->provider_data = NULL;
-	cm_id->rem_ref(cm_id);
-	return ret;
-}
-
-static int hvnd_destroy_listen(struct iw_cm_id *cm_id)
-{
-	struct hvnd_dev *nd_dev;
-	struct hvnd_ucontext *uctx;
-	struct hvnd_ep_obj *ep_object;
-
-	nd_dev = to_nd_dev(cm_id->device);
-
-	ep_object = (struct hvnd_ep_obj *)cm_id->provider_data;
-
-	hvnd_debug("uctx is %p\n", ep_object->uctx);
-	hvnd_debug("Destroying Listener cm_id=%p\n", cm_id);
-	uctx = ep_object->uctx;
-
-	// make sure there is nothing in progress on this ep
-	ep_stop(ep_object);
-
-	hvnd_free_handle(nd_dev, uctx,
-			 ep_object->ep_handle,
-			 IOCTL_ND_LISTENER_FREE);
-
-	/*
-	 * We may have an outstanding connector for
-	 * incoming connection requests; clean it up.
-	 */
-
-	if (ep_object->outstanding_handle != 0) {
-
-		// make sure there is nothing in progress on this ep
-		ep_stop(ep_object->outstanding_ep);
-
-		hvnd_free_handle(nd_dev, uctx,
-				 ep_object->outstanding_handle,
-				 IOCTL_ND_CONNECTOR_FREE);
-
-
-		hvnd_drop_uctx_ref(nd_dev, uctx);
-		hvnd_destroy_ep(ep_object->outstanding_ep);
-	}
-
-	/*
-	 * Now everything should have stopped.
-	 */
-
-	cm_id->rem_ref(cm_id);
-	hvnd_destroy_ep(ep_object);
-	cm_id->provider_data = NULL;
-	hvnd_drop_uctx_ref(nd_dev, uctx);
-
-	hvnd_debug("cm_id=%p\n", cm_id);
-	return 0;
-}
-
-static void hvnd_qp_add_ref(struct ib_qp *ibqp)
-{
-	struct hvnd_qp *qp = to_nd_qp(ibqp);
-
-	atomic_inc(&qp->refcnt);
-}
-
-void hvnd_qp_rem_ref(struct ib_qp *ibqp)
-{
-	struct hvnd_qp *qp = to_nd_qp(ibqp);
-
-	if (atomic_dec_and_test(&qp->refcnt))
-		wake_up(&qp->wait);
-}
-
-static DEVICE_ATTR(hw_rev, 0444, hvnd_show_rev, NULL);
-static DEVICE_ATTR(fw_ver, 0444, hvnd_show_fw_ver, NULL);
-static DEVICE_ATTR(hca_type, 0444, hvnd_show_hca, NULL);
-static DEVICE_ATTR(board_id, 0444, hvnd_show_board, NULL);
-
-static struct device_attribute *hvnd_class_attributes[] = {
-	&dev_attr_hw_rev,
-	&dev_attr_fw_ver,
-	&dev_attr_hca_type,
-	&dev_attr_board_id,
-};
-
-int hvnd_register_device(struct hvnd_dev *dev, char *ip_addr, char *mac_addr)
-{
-	int ret;
-
-	dev->ibdev.owner = THIS_MODULE;
-	dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY | IB_DEVICE_MEM_WINDOW;
-	dev->ibdev.local_dma_lkey = 0;
-	dev->ibdev.uverbs_cmd_mask =
-		(1ull << IB_USER_VERBS_CMD_GET_CONTEXT) |
-		(1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) |
-		(1ull << IB_USER_VERBS_CMD_QUERY_PORT) |
-		(1ull << IB_USER_VERBS_CMD_ALLOC_PD) |
-		(1ull << IB_USER_VERBS_CMD_DEALLOC_PD) |
-		(1ull << IB_USER_VERBS_CMD_REG_MR) |
-		(1ull << IB_USER_VERBS_CMD_DEREG_MR) |
-		(1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL) |
-		(1ull << IB_USER_VERBS_CMD_CREATE_CQ) |
-		(1ull << IB_USER_VERBS_CMD_DESTROY_CQ) |
-		(1ull << IB_USER_VERBS_CMD_REQ_NOTIFY_CQ) |
-		(1ull << IB_USER_VERBS_CMD_CREATE_QP) |
-		(1ull << IB_USER_VERBS_CMD_MODIFY_QP) |
-		(1ull << IB_USER_VERBS_CMD_QUERY_QP) |
-		(1ull << IB_USER_VERBS_CMD_POLL_CQ) |
-		(1ull << IB_USER_VERBS_CMD_DESTROY_QP) |
-		(1ull << IB_USER_VERBS_CMD_POST_SEND) |
-		(1ull << IB_USER_VERBS_CMD_POST_RECV);
-	dev->ibdev.node_type = RDMA_NODE_RNIC;
-	memcpy(dev->ibdev.node_desc, HVND_NODE_DESC, sizeof(HVND_NODE_DESC));
-	memcpy(&dev->ibdev.node_guid, mac_addr, 6);
-	dev->ibdev.phys_port_cnt =
1; //dev->nports; - dev->ibdev.num_comp_vectors = 1; - dev->ibdev.query_device = hvnd_query_device; - dev->ibdev.query_port = hvnd_query_port; - dev->ibdev.get_link_layer = hvnd_get_link_layer; - dev->ibdev.query_pkey = hvnd_query_pkey; - dev->ibdev.query_gid = hvnd_query_gid; - dev->ibdev.alloc_ucontext = hvnd_alloc_ucontext; - dev->ibdev.dealloc_ucontext = hvnd_dealloc_ucontext; - dev->ibdev.mmap = hvnd_mmap; - dev->ibdev.alloc_pd = hvnd_allocate_pd; - dev->ibdev.dealloc_pd = hvnd_deallocate_pd; - dev->ibdev.create_ah = hvnd_ah_create; - dev->ibdev.destroy_ah = hvnd_ah_destroy; - dev->ibdev.create_qp = hvnd_ib_create_qp; - dev->ibdev.modify_qp = hvnd_ib_modify_qp; - dev->ibdev.query_qp = hvnd_ib_query_qp; - dev->ibdev.destroy_qp = hvnd_destroy_qp; - dev->ibdev.create_cq = hvnd_ib_create_cq; - dev->ibdev.destroy_cq = hvnd_ib_destroy_cq; - dev->ibdev.resize_cq = hvnd_resize_cq; - dev->ibdev.poll_cq = hvnd_poll_cq; - dev->ibdev.get_dma_mr = hvnd_get_dma_mr; - dev->ibdev.reg_user_mr = hvnd_reg_user_mr; - dev->ibdev.dereg_mr = hvnd_dereg_mr; - dev->ibdev.alloc_mw = hvnd_alloc_mw; - dev->ibdev.dealloc_mw = hvnd_dealloc_mw; - dev->ibdev.attach_mcast = hvnd_multicast_attach; - dev->ibdev.detach_mcast = hvnd_multicast_detach; - dev->ibdev.process_mad = hvnd_process_mad; - dev->ibdev.req_notify_cq = hvnd_arm_cq; - dev->ibdev.post_send = hvnd_post_send; - dev->ibdev.post_recv = hvnd_post_receive; - dev->ibdev.uverbs_abi_ver = MLX4_IB_UVERBS_ABI_VERSION; - - dev->ibdev.get_port_immutable = hvnd_get_port_immutable; - - //DMA ops for mapping all possible addresses - dev->ibdev.dev.parent = &(dev->hvdev->device); - dev->ibdev.dev.dma_ops = &vmbus_dma_ops; - dev->ibdev.dev.dma_mask = (u64 *) DMA_BIT_MASK(64); - dev->ibdev.dev.coherent_dma_mask = DMA_BIT_MASK(64); - - dev->ibdev.iwcm = kmalloc(sizeof(struct iw_cm_verbs), GFP_KERNEL); - if (!dev->ibdev.iwcm) - return -ENOMEM; - - dev->ibdev.iwcm->connect = hvnd_connect; - dev->ibdev.iwcm->accept = hvnd_accept_cr; - dev->ibdev.iwcm->reject = hvnd_reject_cr; - dev->ibdev.iwcm->create_listen = hvnd_create_listen; - dev->ibdev.iwcm->destroy_listen = hvnd_destroy_listen; - dev->ibdev.iwcm->add_ref = hvnd_qp_add_ref; - dev->ibdev.iwcm->rem_ref = hvnd_qp_rem_ref; - dev->ibdev.iwcm->get_qp = hvnd_get_qp; - - strlcpy(dev->ibdev.name, "mlx4_%d", IB_DEVICE_NAME_MAX); - ret = ib_register_device(&dev->ibdev, NULL); - if (ret) { - hvnd_error("ib_register_device failed ret=%d\n", ret); - goto bail1; - } - -#if 0 - for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) { - ret = device_create_file(&dev->ibdev.dev, - hvnd_class_attributes[i]); - if (ret) { - hvnd_error("device_create_file failed ret=%d\n", ret); - goto bail2; - } - } -#endif - - dev->ib_active = true; - return 0; -//bail2: -// ib_unregister_device(&dev->ibdev); -bail1: - kfree(dev->ibdev.iwcm); - return ret; -} - -void hvnd_unregister_device(struct hvnd_dev *dev) -{ - int i; - - for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) - device_remove_file(&dev->ibdev.dev, - hvnd_class_attributes[i]); - ib_unregister_device(&dev->ibdev); - kfree(dev->ibdev.iwcm); - ib_dealloc_device((struct ib_device *)dev); -} - -static int hvnd_try_bind_nic(unsigned char *mac, __be32 ip) -{ - int ret; - struct hvnd_dev *nd_dev = g_nd_dev; - - mutex_lock(&nd_dev->bind_mutex); - if (nd_dev->bind_complete) { - mutex_unlock(&nd_dev->bind_mutex); - return 1; - } - - memcpy(nd_dev->mac_addr, mac, 6); - *(__be32 *)(nd_dev->ip_addr) = ip; - - /* - * Bind the NIC. 
- */ - hvnd_info("trying to bind to IP %pI4 MAC %pM\n", - nd_dev->ip_addr, nd_dev->mac_addr); - ret = hvnd_bind_nic(nd_dev, false, nd_dev->ip_addr, nd_dev->mac_addr); - if (ret || nd_dev->bind_pkt.pkt_hdr.status) { - mutex_unlock(&nd_dev->bind_mutex); - return 1; - } - - /* if we reach here, this means bind_nic is a success */ - hvnd_error("successfully bound to IP %pI4 MAC %pM\n", - nd_dev->ip_addr, nd_dev->mac_addr); - complete(&nd_dev->addr_set); - nd_dev->bind_complete = 1; - mutex_unlock(&nd_dev->bind_mutex); - - ret = hvnd_register_device(nd_dev, nd_dev->ip_addr, nd_dev->mac_addr); - - if (!ret) - return 0; - - hvnd_error("hvnd_register_device failed ret=%d\n", ret); - - /* roll back all allocated resources on error */ - iounmap(nd_dev->mmio_virt); - release_resource(&nd_dev->mmio_resource); - - vmbus_close(nd_dev->hvdev->channel); - ib_dealloc_device((struct ib_device *)nd_dev); - - return 1; -} - -static void hvnd_inetaddr_event_up(unsigned long event, struct in_ifaddr *ifa) -{ - hvnd_try_bind_nic(ifa->ifa_dev->dev->dev_addr, ifa->ifa_address); -} - -static int hvnd_inetaddr_event(struct notifier_block *notifier, - unsigned long event, void *ptr) -{ - struct in_ifaddr *ifa = ptr; - - switch (event) { - case NETDEV_UP: - hvnd_inetaddr_event_up(event, ifa); - break; - default: - hvnd_debug("Received inetaddr event %lu\n", event); - } - - return NOTIFY_DONE; -} - -static struct notifier_block hvnd_inetaddr_notifier = { - .notifier_call = hvnd_inetaddr_event, -}; - -static int start_bind_nic(void) -{ - struct net_device *dev; - struct in_device *idev; - struct in_ifaddr *ifa; - - register_inetaddr_notifier(&hvnd_inetaddr_notifier); - - rtnl_lock(); - for_each_netdev(&init_net, dev) { - idev = in_dev_get(dev); - if (!idev) - continue; - for (ifa = (idev)->ifa_list; - ifa && !(ifa->ifa_flags&IFA_F_SECONDARY); - ifa = ifa->ifa_next) { - hvnd_try_bind_nic(dev->dev_addr, ifa->ifa_address); - } - } - rtnl_unlock(); - - return 0; -} - -static int hvnd_probe(struct hv_device *dev, - const struct hv_vmbus_device_id *dev_id) -{ - struct hvnd_dev *nd_dev; - int ret = 0; - - hvnd_debug("hvnd starting\n"); - - nd_dev = (struct hvnd_dev *)ib_alloc_device(sizeof(struct hvnd_dev)); - if (!nd_dev) { - ret = -ENOMEM; - goto err_out0; - } - - nd_dev->hvdev = dev; - /* - * We are going to masquerade as MLX4 device; - * Set the vendor and device ID accordingly. - */ - dev->vendor_id = 0x15b3; //Mellanox - dev->device_id = 0x1003; //Mellanox HCA - INIT_LIST_HEAD(&nd_dev->listentry); - spin_lock_init(&nd_dev->uctxt_lk); - nd_dev->ib_active = false; - - /* - * Initialize the state for the id table. - */ - spin_lock_init(&nd_dev->id_lock); - idr_init(&nd_dev->cqidr); - idr_init(&nd_dev->qpidr); - idr_init(&nd_dev->mmidr); - idr_init(&nd_dev->irpidr); - idr_init(&nd_dev->uctxidr); - - atomic_set(&nd_dev->open_cnt, 0); - - sema_init(&nd_dev->query_pkt_sem, 1); - - ret = vmbus_open(dev->channel, HVND_RING_SZ, HVND_RING_SZ, NULL, 0, - hvnd_callback, dev); - - if (ret) { - hvnd_error("vmbus_open failed ret=%d\n", ret); - goto err_out1; - } - - hv_set_drvdata(dev, nd_dev); - - ret = hvnd_negotiate_version(nd_dev); - - if (ret) { - hvnd_error("hvnd_negotiate_version failed ret=%d\n", ret); - goto err_out2; - } - - /* - * Register resources with the host. 
- */ - ret = hvnd_init_resources(nd_dev); - if (ret) { - hvnd_error("hvnd_init_resources failed ret=%d\n", ret); - goto err_out2; - } - - /* - * Try to bind every NIC to ND channel, - * ND host will only return success for the correct one - */ - nd_dev->bind_complete = 0; - mutex_init(&nd_dev->bind_mutex); - init_completion(&nd_dev->addr_set); - - g_nd_dev = nd_dev; - start_bind_nic(); - - return 0; - -err_out2: - vmbus_close(dev->channel); - -err_out1: - ib_dealloc_device((struct ib_device *)nd_dev); - -err_out0: - return ret; -} - -static int hvnd_remove(struct hv_device *dev) -{ - struct hvnd_dev *nd_dev = hv_get_drvdata(dev); - - unregister_inetaddr_notifier(&hvnd_inetaddr_notifier); - hvnd_bind_nic(nd_dev, true, nd_dev->ip_addr, nd_dev->mac_addr); - hvnd_unregister_device(nd_dev); - vmbus_close(dev->channel); - iounmap(nd_dev->mmio_virt); - release_resource(&nd_dev->mmio_resource); - return 0; -} - -static const struct hv_vmbus_device_id id_table[] = { - /* VMBUS RDMA class guid */ - /* 8c2eaf3d-32a7-4b09-ab99-bd1f1c86b501 */ - { HV_ND_GUID, }, - { }, -}; - -MODULE_DEVICE_TABLE(vmbus, id_table); - -static struct hv_driver hvnd_drv = { - .name = "hv_guest_rdma", - .id_table = id_table, - .probe = hvnd_probe, - .remove = hvnd_remove, -}; - - -static int __init init_hvnd_drv(void) -{ - - pr_info("Registered HyperV networkDirect Driver\n"); - hvnd_addr_init(); - return vmbus_driver_register(&hvnd_drv); -} - -static void exit_hvnd_drv(void) -{ - pr_info("De-Registered HyperV networkDirect Driver\n"); - hvnd_addr_deinit(); - vmbus_driver_unregister(&hvnd_drv); -} - - -module_init(init_hvnd_drv); -module_exit(exit_hvnd_drv); - -MODULE_DESCRIPTION("Hyper-V NetworkDirect Driver"); -MODULE_LICENSE("GPL"); -MODULE_VERSION("4.1.3"); diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/user.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/user.h deleted file mode 100644 index 07e6769ef43b..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/user.h +++ /dev/null @@ -1,107 +0,0 @@ -/* - * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved. - * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. 
- */ - -#ifndef MLX4_IB_USER_H -#define MLX4_IB_USER_H - -#include <linux/types.h> - -/* - * Increment this value if any changes that break userspace ABI - * compatibility are made. - */ - -#define MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION 3 -#define MLX4_IB_UVERBS_ABI_VERSION 4 - -/* - * Make sure that all structs defined in this file remain laid out so - * that they pack the same way on 32-bit and 64-bit architectures (to - * avoid incompatibility between 32-bit userspace and 64-bit kernels). - * In particular do not use pointer types -- pass pointers in __u64 - * instead. - */ - -struct mlx4_ib_alloc_ucontext_resp_v3 { - __u32 qp_tab_size; - __u16 bf_reg_size; - __u16 bf_regs_per_page; -}; - -struct mlx4_ib_alloc_ucontext_resp { - __u32 dev_caps; - __u32 qp_tab_size; - __u16 bf_reg_size; - __u16 bf_regs_per_page; - __u32 cqe_size; -}; - -struct mlx4_ib_alloc_pd_resp { - __u32 pdn; - __u32 reserved; -}; - -struct mlx4_ib_create_cq { - __u64 buf_addr; - __u64 db_addr; -}; - -struct mlx4_ib_create_cq_resp { - __u32 cqn; - __u32 reserved; -}; - -struct mlx4_ib_resize_cq { - __u64 buf_addr; -}; - -struct mlx4_ib_create_srq { - __u64 buf_addr; - __u64 db_addr; -}; - -struct mlx4_ib_create_srq_resp { - __u32 srqn; - __u32 reserved; -}; - -struct mlx4_ib_create_qp { - __u64 buf_addr; - __u64 db_addr; - __u8 log_sq_bb_count; - __u8 log_sq_stride; - __u8 sq_no_prefetch; - __u8 reserved[5]; -}; - -#endif /* MLX4_IB_USER_H */ diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.c deleted file mode 100644 index daf68dcccb1b..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.c +++ /dev/null @@ -1,3074 +0,0 @@ -/* - * Copyright (c) 2014, Microsoft Corporation. - * - * Author: - * K. Y. Srinivasan <kys@microsoft.com> - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License version 2 as published - * by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or - * NON INFRINGEMENT. See the GNU General Public License for more - * details. - * - * Bug fixes/enhancements: Long Li <longli@microsoft.com> - */ - -#include <linux/completion.h> -#include <linux/module.h> -#include <linux/errno.h> -#include <linux/hyperv.h> -#include <linux/efi.h> -#include <linux/slab.h> -#include <linux/cred.h> -#include <linux/sched.h> -#include <linux/types.h> -#include <linux/scatterlist.h> -#include <rdma/ib_umem.h> -#include <rdma/ib_user_verbs.h> -#include <asm-generic/delay.h> - -#include "vmbus_rdma.h" - -/* - * We only have a single rdma device on the host; - * have a single receive buffer. 
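
The layout rule spelled out in the mlx4 ABI header above (identical packing on 32-bit and 64-bit, no pointer types, pointers passed as __u64) is easy to demonstrate concretely. A small userspace sketch, using a stand-in struct rather than the real mlx4 one, showing how a buffer address travels through such an ABI:

/* sketch: a pointer crosses a fixed-layout ABI as a __u64 */
#include <stdint.h>
#include <stdio.h>

struct example_create_cq {
	uint64_t buf_addr;	/* __u64 in the kernel header */
	uint64_t db_addr;
};

int main(void)
{
	static char cq_buf[4096];
	struct example_create_cq cmd = {
		.buf_addr = (uint64_t)(uintptr_t)cq_buf,	/* never a raw pointer */
		.db_addr  = 0,
	};
	/* layout is the same whether userspace is 32-bit or 64-bit */
	printf("sizeof(cmd)=%zu buf=0x%llx\n", sizeof(cmd),
	       (unsigned long long)cmd.buf_addr);
	return 0;
}
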
- */ - - static char hvnd_recv_buffer[PAGE_SIZE * 4]; - - static atomic_t irp_local_hdl; - - char *hvnd_get_op_name(int ioctl) -{ - switch (ioctl) { - case IOCTL_ND_PROVIDER_INIT: - return "IOCTL_ND_PROVIDER_INIT\n"; - case IOCTL_ND_PROVIDER_BIND_FILE: - return "IOCTL_ND_PROVIDER_BIND_FILE\n"; - case IOCTL_ND_ADAPTER_OPEN: - return "IOCTL_ND_ADAPTER_OPEN\n"; - - case IOCTL_ND_ADAPTER_CLOSE: - return "IOCTL_ND_ADAPTER_CLOSE\n"; - - case IOCTL_ND_ADAPTER_QUERY: - return "IOCTL_ND_ADAPTER_QUERY\n"; - - case IOCTL_ND_PD_CREATE: - return "IOCTL_ND_PD_CREATE\n"; - - case IOCTL_ND_PD_FREE: - return "IOCTL_ND_PD_FREE\n"; - - case IOCTL_ND_CQ_CREATE: - return "IOCTL_ND_CQ_CREATE\n"; - - case IOCTL_ND_CQ_FREE: - return "IOCTL_ND_CQ_FREE\n"; - case IOCTL_ND_CQ_CANCEL_IO: - return "IOCTL_ND_CQ_CANCEL_IO\n"; - case IOCTL_ND_CQ_GET_AFFINITY: - return "IOCTL_ND_CQ_GET_AFFINITY\n"; - case IOCTL_ND_CQ_MODIFY: - return "IOCTL_ND_CQ_MODIFY\n"; - - case IOCTL_ND_CQ_NOTIFY: - return "IOCTL_ND_CQ_NOTIFY\n"; - - - case IOCTL_ND_LISTENER_CREATE: - return "IOCTL_ND_LISTENER_CREATE\n"; - - case IOCTL_ND_LISTENER_FREE: - return "IOCTL_ND_LISTENER_FREE\n"; - - case IOCTL_ND_QP_FREE: - return "IOCTL_ND_QP_FREE\n"; - - case IOCTL_ND_CONNECTOR_CANCEL_IO: - return "IOCTL_ND_CONNECTOR_CANCEL_IO\n"; - - case IOCTL_ND_LISTENER_CANCEL_IO: - return "IOCTL_ND_LISTENER_CANCEL_IO\n"; - - case IOCTL_ND_LISTENER_BIND: - return "IOCTL_ND_LISTENER_BIND\n"; - - case IOCTL_ND_LISTENER_LISTEN: - return "IOCTL_ND_LISTENER_LISTEN\n"; - - case IOCTL_ND_LISTENER_GET_ADDRESS: - return "IOCTL_ND_LISTENER_GET_ADDRESS\n"; - - case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: - return "IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST\n"; - - - - case IOCTL_ND_CONNECTOR_CREATE: - return "IOCTL_ND_CONNECTOR_CREATE\n"; - - case IOCTL_ND_CONNECTOR_FREE: - return "IOCTL_ND_CONNECTOR_FREE\n"; - - case IOCTL_ND_CONNECTOR_BIND: - return "IOCTL_ND_CONNECTOR_BIND\n"; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_CONNECT: - return "IOCTL_ND_CONNECTOR_CONNECT\n"; - - case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: - return "IOCTL_ND_CONNECTOR_COMPLETE_CONNECT\n"; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_ACCEPT: - return "IOCTL_ND_CONNECTOR_ACCEPT\n"; - - case IOCTL_ND_CONNECTOR_REJECT: - return "IOCTL_ND_CONNECTOR_REJECT\n"; - - case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: - return "IOCTL_ND_CONNECTOR_GET_READ_LIMITS\n"; - - case IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: - return "IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA\n"; - - case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: - return "IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS\n"; - - case IOCTL_ND_CONNECTOR_GET_ADDRESS: - return "IOCTL_ND_CONNECTOR_GET_ADDRESS\n"; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: - return "IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT\n"; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_DISCONNECT: - return "IOCTL_ND_CONNECTOR_DISCONNECT\n"; - - - - case IOCTL_ND_QP_CREATE: - return "IOCTL_ND_QP_CREATE\n"; - - case IOCTL_ND_MR_CREATE: - return "IOCTL_ND_MR_CREATE\n"; - - case IOCTL_ND_MR_FREE: - return "IOCTL_ND_MR_FREE\n"; - case IOCTL_ND_MR_REGISTER: - return "IOCTL_ND_MR_REGISTER\n"; - case IOCTL_ND_MR_DEREGISTER: - return "IOCTL_ND_MR_DEREGISTER\n"; - case IOCTL_ND_MR_CANCEL_IO: - return "IOCTL_ND_MR_CANCEL_IO\n"; - case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: - return "IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST\n"; - case IOCTL_ND_QP_FLUSH: - return "IOCTL_ND_QP_FLUSH\n"; - - default: - return "Unknown IOCTL\n"; - } -} -int get_irp_handle(struct hvnd_dev *nd_dev, u32 *local, void *irp_ctx) -{ - unsigned int local_handle; - int ret; - - local_handle = atomic_inc_return(&irp_local_hdl); - *local = local_handle; - - /* - * Now associate the local handle with the pointer. - */ - ret = insert_handle(nd_dev, &nd_dev->irpidr, irp_ctx, local_handle); - hvnd_debug("irp_ctx=%p local_handle=%u\n", irp_ctx, local_handle); - - if (ret) { - hvnd_error("insert_handle failed ret=%d\n", ret); - return ret; - } - - return 0; -} - -void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp) -{ - remove_handle(nd_dev, &nd_dev->irpidr, irp); - -} - -static void init_pfn(u64 *pfn, void *addr, u32 length) -{ - int i; - u32 offset = offset_in_page(addr); - u32 num_pfn = DIV_ROUND_UP(offset + length, PAGE_SIZE); - - for (i = 0; i < num_pfn; i++) - pfn[i] = virt_to_phys((u8 *)addr + - (PAGE_SIZE * i)) >> PAGE_SHIFT; -} - - -static void user_va_init_pfn(u64 *pfn, struct ib_umem *umem) -{ - int entry; - struct scatterlist *sg; - int i = 0; - - for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) { - pfn[i++] = page_to_pfn(sg_page(sg)); - } -} - -static u32 get_local_handle(void *p) -{ - u64 val = (unsigned long)p; - - return (u32)val; -} - -static int hvnd_send_pg_buffer(struct hvnd_dev *nd_dev, - struct vmbus_packet_mpb_array *desc, - u32 desc_size, - void *buffer, - u32 bufferlen, u64 cookie) -{ - int ret; - int t; - struct hvnd_cookie hvnd_cookie; - - hvnd_cookie.pkt = (void *)cookie; - init_completion(&hvnd_cookie.host_event); - - ret = vmbus_sendpacket_mpb_desc(nd_dev->hvdev->channel, - desc, - desc_size, - buffer, bufferlen, - (u64)(&hvnd_cookie)); - - if (ret) { - hvnd_error("vmbus_sendpacket_mpb_desc failed ret=%d\n", ret); - goto err; - } - - t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); - - if (t == 0) { - hvnd_error("wait_for_completion_timeout timed out\n"); - ret = -ETIMEDOUT; - } - -err: - return ret; -} - -static int hvnd_send_packet(struct hvnd_dev *nd_dev, void *buffer, - u32 bufferlen, u64 cookie, bool block) -{ - int ret; - int t; - struct hvnd_cookie hvnd_cookie; - - hvnd_cookie.pkt = (void *)cookie; - init_completion(&hvnd_cookie.host_event); - - ret = vmbus_sendpacket(nd_dev->hvdev->channel, buffer, bufferlen, - (u64)(&hvnd_cookie), VM_PKT_DATA_INBAND, - VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); - - if (ret) { - hvnd_error("vmbus_send pkt failed: %d\n", ret); - goto err; - } - - if (!block) - return ret; - - t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); - - if (t == 0) { - hvnd_error("wait_for_completion_timeout timed out\n"); - ret = -ETIMEDOUT; - } - -err: - return ret; -} - -static int hvnd_send_pgbuf_ioctl_pkt(struct hvnd_dev *nd_dev, - struct vmbus_packet_mpb_array *desc, - u32 desc_size, - struct ndv_packet_hdr_control_1 *hdr, - u32 pkt_size, u64 cookie) -{ - int ret; - int ioctl; - - ioctl = hdr->io_cntrl_code; - - - ret = hvnd_send_pg_buffer(nd_dev, desc, desc_size, - hdr, pkt_size, cookie); - - if (ret) - return ret; - - if (hdr->pkt_hdr.status != 0) { - hvnd_error("IOCTL: %s failed; status is %x\n", - hvnd_get_op_name(ioctl), - hdr->pkt_hdr.status); - return -EINVAL; - } - - switch (hdr->io_status) { - case STATUS_SUCCESS: - case STATUS_PENDING: - return 0; - - default: - hvnd_error("IOCTL: %s failed io status is %x\n", - hvnd_get_op_name(ioctl), hdr->io_status); - return -EINVAL; - } -} - -int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev, - struct ndv_packet_hdr_control_1 *hdr, - u32 pkt_size, u64 cookie) -{ - int ret; - int ioctl; - bool block; - - block = (hdr->irp_handle.val64 == 0) ?
true : false; - - - ioctl = hdr->io_cntrl_code; - - ret = hvnd_send_packet(nd_dev, hdr, pkt_size, cookie, block); - - if (ret) - return ret; - - if (!block) - return ret; - - if (hdr->pkt_hdr.status != 0) { - hvnd_error("IOCTL: %s failed; status is %x\n", - hvnd_get_op_name(ioctl), hdr->pkt_hdr.status); - return -EINVAL; - } - - switch (hdr->io_status) { - case STATUS_SUCCESS: - case STATUS_PENDING: - return 0; - - default: - hvnd_warn("IOCTL: %s failed io status is %x\n", - hvnd_get_op_name(ioctl), hdr->io_status); - return -EINVAL; - } -} - -void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr, - u32 data_sz, u32 local, u32 remote, - u32 ioctl_code, - u32 ext_data_sz, u32 ext_data_offset, - u64 irp_handle) - -{ - int pkt_type; - - pkt_type = NDV_PKT_ID1_CONTROL; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - hdr->pkt_hdr.packet_type = pkt_type; - hdr->pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); - hdr->pkt_hdr.data_sz = data_sz; - - hdr->pkt_hdr.status = 0; - - hdr->file_handle.local = local; - hdr->file_handle.remote = remote; - hdr->irp_handle.val64 = irp_handle; - - hdr->io_cntrl_code = ioctl_code; - hdr->output_buf_sz = data_sz - ext_data_sz; - hdr->input_buf_sz = data_sz - ext_data_sz; - - hdr->input_output_buf_offset = 0; - - hdr->extended_data.size = ext_data_sz; - hdr->extended_data.offset = ext_data_offset; -} - - -int hvnd_create_file(struct hvnd_dev *nd_dev, void *uctx, - struct ndv_pkt_hdr_create_1 *create, u32 file_flags) -{ - int ret; - int pkt_type; - - - pkt_type = NDV_PKT_ID1_CREATE; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - create->pkt_hdr.packet_type = pkt_type; - create->pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); - create->pkt_hdr.data_sz = 0; - - create->handle.local = get_local_handle(uctx); - create->access_mask = STANDARD_RIGHTS_ALL; - create->open_options = OPEN_EXISTING; - create->file_attributes = FILE_ATTRIBUTE_NORMAL | file_flags; - create->share_access = FILE_SHARE_ALL; - - ret = hvnd_send_packet(nd_dev, create, - sizeof(struct ndv_pkt_hdr_create_1), - (unsigned long)create, true); - return ret; -} - -int hvnd_cleanup_file(struct hvnd_dev *nd_dev, u32 local, u32 remote) -{ - int ret; - int pkt_type; - struct ndv_pkt_hdr_cleanup_1 cleanup_pkt; - - - pkt_type = NDV_PKT_ID1_CLEANUP; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - - cleanup_pkt.pkt_hdr.packet_type = pkt_type; - cleanup_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); - cleanup_pkt.pkt_hdr.data_sz = 0; - - cleanup_pkt.handle.local = local; - cleanup_pkt.handle.remote = remote; - - ret = hvnd_send_packet(nd_dev, &cleanup_pkt, - sizeof(struct ndv_pkt_hdr_create_1), - (unsigned long)&cleanup_pkt, true); - return ret; -} - - -static int hvnd_do_ioctl(struct hvnd_dev *nd_dev, u32 ioctl, - struct pkt_nd_provider_ioctl *pkt, - union ndv_context_handle *hdr_handle, - struct nd_handle *ioctl_handle, - u8 *buf, u32 buf_len, bool c_in, bool c_out, u64 irp_val) -{ - int ret; - int pkt_type; - - pkt_type = NDV_PKT_ID1_CONTROL; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - - pkt->hdr.pkt_hdr.packet_type = pkt_type; - pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); - pkt->hdr.pkt_hdr.data_sz = (sizeof(struct pkt_nd_provider_ioctl) - - sizeof(struct ndv_packet_hdr_control_1)); - - pkt->hdr.file_handle.local = hdr_handle->local; - pkt->hdr.file_handle.remote = hdr_handle->remote; - hvnd_debug("create handle local: %x remote: %x\n", - 
hdr_handle->local, hdr_handle->remote); - - pkt->hdr.irp_handle.val64 = irp_val; - - pkt->hdr.io_cntrl_code = ioctl; - pkt->hdr.output_buf_sz = sizeof(struct nd_ioctl); - pkt->hdr.input_buf_sz = sizeof(struct nd_ioctl); - pkt->hdr.input_output_buf_offset = 0; - memset(&pkt->ioctl.handle, 0, sizeof(struct nd_handle)); - pkt->ioctl.handle.version = ND_VERSION_1; - - switch (ioctl) { - case IOCTL_ND_PROVIDER_BIND_FILE: - pkt->ioctl.handle.handle = ioctl_handle->handle; - break; - default: - break; - }; - - /* - * Copy the input buffer, if needed. - */ - - if (c_in && (buf != NULL)) - memcpy(pkt->ioctl.raw_buffer, buf, buf_len); - - ret = hvnd_send_packet(nd_dev, pkt, - sizeof(struct pkt_nd_provider_ioctl), - (unsigned long)pkt, true); - - if (ret) - return ret; - - if (c_out && (buf != NULL)) - memcpy(buf, pkt->ioctl.raw_buffer, buf_len); - - return ret; -} - -static int idr_callback(int id, void *p, void *data) -{ - if (p == data) - return id; - return 0; -} - -void remove_uctx(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) -{ - int pid = current_pid(); - unsigned long flags; - int id; - - if (get_uctx(nd_dev, pid) == uctx) - remove_handle(nd_dev, &nd_dev->uctxidr, pid); - else { - hvnd_warn("uctx %p not found on pid %d, doing a idr search\n", - uctx, current_pid()); - - spin_lock_irqsave(&nd_dev->id_lock, flags); - id = idr_for_each(&nd_dev->uctxidr, idr_callback, uctx); - spin_unlock_irqrestore(&nd_dev->id_lock, flags); - - if (id) - remove_handle(nd_dev, &nd_dev->uctxidr, id); - else { - hvnd_error("uctx %p not found in idr table\n", uctx); - return; - } - } - - kfree(uctx); -} - -int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) -{ - int ret; - - /* - * First close the adaptor. - */ - - ret = hvnd_free_handle(nd_dev, uctx, - uctx->adaptor_hdl, - IOCTL_ND_ADAPTER_CLOSE); - - if (ret) - hvnd_error("Adaptor close failed; ret is %x\n", ret); - - /* - * Now close the two files we created. - */ - - ret = hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, - uctx->file_handle_ovl.remote); - - if (ret) - hvnd_error("file cleanup failed; ret is %x\n", ret); - - ret = hvnd_cleanup_file(nd_dev, uctx->file_handle.local, - uctx->file_handle.remote); - - if (ret) - hvnd_error("File cleanup failed; ret is %x\n", ret); - - /* - * Remove the uctx from the ID table. - */ - remove_uctx(nd_dev, uctx); - - return 0; -} - -int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) -{ - int ret; - struct pkt_nd_provider_ioctl *pr_init = &uctx->pr_init_pkt; - int pkt_type; - struct nd_handle ioctl_handle; - struct pkt_nd_open_adapter *pr_o_adap = &uctx->o_adap_pkt; - - ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt, 0); - if (ret) { - hvnd_error("hvnd_create_file failed ret=%d\n", ret); - goto error_cr; - } - - if (uctx->create_pkt.pkt_hdr.status != 0) { - hvnd_error("create File failed; status is %d\n", - uctx->create_pkt.pkt_hdr.status); - ret = -EINVAL; - goto error_cr; - } - - uctx->file_handle.local = uctx->create_pkt.handle.local; - uctx->file_handle.remote = uctx->create_pkt.handle.remote; - hvnd_debug("INITIALIZE PROVIDER\n"); - /* - * Now Initialize the Provider. 
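
Stepping back to the handle tables used above: insert_handle()/remove_handle() against the cqidr/qpidr/irpidr/uctxidr tables (plus the idr_for_each() fallback search in remove_uctx()) are standard kernel idr usage. The helpers themselves are defined elsewhere in this driver; a plausible sketch of the pattern, assuming the id_lock spinlock guards each table (the real helpers may differ in detail):

/* hypothetical sketch of the idr-backed handle helpers used here */
static int insert_handle(struct hvnd_dev *nd_dev, struct idr *idr,
			 void *ptr, u32 id)
{
	unsigned long flags;
	int ret;

	idr_preload(GFP_KERNEL);
	spin_lock_irqsave(&nd_dev->id_lock, flags);
	/* claim exactly the caller-chosen id */
	ret = idr_alloc(idr, ptr, id, id + 1, GFP_NOWAIT);
	spin_unlock_irqrestore(&nd_dev->id_lock, flags);
	idr_preload_end();

	return ret < 0 ? ret : 0;
}

static void remove_handle(struct hvnd_dev *nd_dev, struct idr *idr, u32 id)
{
	unsigned long flags;

	spin_lock_irqsave(&nd_dev->id_lock, flags);
	idr_remove(idr, id);
	spin_unlock_irqrestore(&nd_dev->id_lock, flags);
}
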
- */ - ioctl_handle.handle = 0; - ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_INIT, pr_init, - &uctx->create_pkt.handle, - &ioctl_handle, NULL, 0, false, false, 0); - - if (ret) { - ret = -EINVAL; - goto error_pr_init; - } - - if (pr_init->hdr.pkt_hdr.status != 0) { - hvnd_error("Provider INIT failed; status is %d\n", - pr_init->hdr.pkt_hdr.status); - ret = -EINVAL; - goto error_pr_init; - } - - if (pr_init->hdr.io_status != 0) { - hvnd_error("Provider INIT failed; io status is %d\n", - pr_init->hdr.io_status); - ret = -EINVAL; - goto error_pr_init; - } - - /* - * Now create the overlap file. - */ - - hvnd_debug("CREATE OVERLAP FILE\n"); - ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt_ovl, - FILE_FLAG_OVERLAPPED); - if (ret) { - hvnd_error("hvnd_create_file failed ret=%d\n", ret); - goto error_pr_init; - } - - if (uctx->create_pkt_ovl.pkt_hdr.status != 0) { - hvnd_error("create Overlap File failed; status is %d\n", - uctx->create_pkt_ovl.pkt_hdr.status); - ret = -EINVAL; - goto error_pr_init; - } - uctx->file_handle_ovl.local = uctx->create_pkt_ovl.handle.local; - uctx->file_handle_ovl.remote = uctx->create_pkt_ovl.handle.remote; - - /* - * Now bind the two file handles together. - */ - - hvnd_debug("BIND FILE IOCTL remote handle: %d local handle: %d\n", - uctx->create_pkt_ovl.handle.remote, - uctx->create_pkt_ovl.handle.local); - - ioctl_handle.handle = uctx->create_pkt_ovl.handle.val64; - ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_BIND_FILE, pr_init, - &uctx->create_pkt.handle, - &ioctl_handle, NULL, 0, false, false, 0); - - if (ret) { - ret = -EINVAL; - goto error_file_bind; - } - if (pr_init->hdr.pkt_hdr.status != 0) { - hvnd_error("Provider File bind failed; status is %d\n", - pr_init->hdr.pkt_hdr.status); - ret = -EINVAL; - goto error_file_bind; - } - if (pr_init->hdr.io_status != 0) { - hvnd_error("Provider INIT failed; io status is %d\n", - pr_init->hdr.io_status); - ret = -EINVAL; - goto error_file_bind; - } - - /* - * Now open the adaptor. 
- */ - - hvnd_debug("OPENING THE ADAPTOR\n"); - - pkt_type = NDV_PKT_ID1_CONTROL; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - pr_o_adap->hdr.pkt_hdr.packet_type = pkt_type; - pr_o_adap->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); - pr_o_adap->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_open_adapter) - - sizeof(struct ndv_packet_hdr_control_1); - - pr_o_adap->hdr.pkt_hdr.status = 0; - - hvnd_debug("hdr sz is %d\n", pr_o_adap->hdr.pkt_hdr.hdr_sz); - hvnd_debug("data sz is %d\n", pr_o_adap->hdr.pkt_hdr.data_sz); - - pr_o_adap->hdr.file_handle.local = uctx->create_pkt.handle.local; - pr_o_adap->hdr.file_handle.remote = uctx->create_pkt.handle.remote; - hvnd_debug("create handle local is %x\n", - uctx->create_pkt.handle.local); - hvnd_debug("create handle remote is %x\n", - uctx->create_pkt.handle.remote); - pr_o_adap->hdr.irp_handle.val64 = 0; - - pr_o_adap->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_OPEN; - pr_o_adap->hdr.output_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - - sizeof(struct extended_data_oad); - pr_o_adap->hdr.input_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - - sizeof(struct extended_data_oad); - - hvnd_debug("output buf sz is %d\n", pr_o_adap->hdr.output_buf_sz); - hvnd_debug("input buf sz is %d\n", pr_o_adap->hdr.input_buf_sz); - hvnd_debug("packet size is %d\n", - (int)sizeof(struct pkt_nd_open_adapter)); - - pr_o_adap->hdr.input_output_buf_offset = 0; - - - pr_o_adap->hdr.extended_data.size = sizeof(struct extended_data_oad); - pr_o_adap->hdr.extended_data.offset = - offsetof(struct pkt_nd_open_adapter, ext_data) - - sizeof(struct ndv_packet_hdr_control_1); - - hvnd_debug("size of the extended data size: %d\n", - (int)sizeof(struct extended_data_oad)); - hvnd_debug("offset of extended data: %d\n", - pr_o_adap->hdr.extended_data.offset); - - /* - * Now fill out the ioctl section. - */ - - pr_o_adap->ioctl.input.version = ND_VERSION_1; - pr_o_adap->ioctl.input.ce_mapping_cnt = - RTL_NUMBER_OF(pr_o_adap->mappings.ctx_input.mappings); - - hvnd_debug("ce_mapping cnt is %d\n", - pr_o_adap->ioctl.input.ce_mapping_cnt); - - pr_o_adap->ioctl.input.cb_mapping_offset = sizeof(union oad_ioctl); - hvnd_debug("cb_mapping offset is %d\n", - pr_o_adap->ioctl.input.cb_mapping_offset); - pr_o_adap->ioctl.input.adapter_id = (u64)nd_dev; - - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_type = ND_MAP_IOSPACE; - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cache_type = ND_NON_CACHED; - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cb_length = 4096; - - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_type = ND_MAP_IOSPACE; - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cache_type = ND_WRITE_COMBINED; - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cb_length = 4096; - - /* - * Fill in the extended data. 
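
All of the size and offset fields computed for this packet follow one convention: they are measured from the end of struct ndv_packet_hdr_control_1, not from the start of the packet. A standalone check of that arithmetic, with stand-in struct sizes (the real driver structs differ):

/* worked example of the header-relative size/offset math */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct ctrl_hdr { uint8_t bytes[96]; };  /* stands in for ndv_packet_hdr_control_1 */
struct ext_data { uint8_t bytes[40]; };
struct pkt {
	struct ctrl_hdr hdr;
	uint8_t ioctl_buf[128];
	struct ext_data ext;
};

int main(void)
{
	size_t data_sz = sizeof(struct pkt) - sizeof(struct ctrl_hdr);
	size_t ext_off = offsetof(struct pkt, ext) - sizeof(struct ctrl_hdr);

	printf("data_sz=%zu ext_data.offset=%zu ext_data.size=%zu\n",
	       data_sz, ext_off, sizeof(struct ext_data));
	/* the in/out buffer excludes the extended data that follows it */
	printf("output_buf_sz=%zu\n", data_sz - sizeof(struct ext_data));
	return 0;
}
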
- */ - pr_o_adap->ext_data.cnt = IBV_GET_CONTEXT_MAPPING_MAX; - - ret = hvnd_send_packet(nd_dev, pr_o_adap, - sizeof(struct pkt_nd_open_adapter), - (unsigned long)pr_o_adap, true); - if (ret) { - ret = -EINVAL; - goto error_file_bind; - } - - if (pr_o_adap->hdr.pkt_hdr.status != 0) { - hvnd_error("Open adaptor failed; status is %d\n", - pr_o_adap->hdr.pkt_hdr.status); - ret = -EINVAL; - goto error_file_bind; - } - - if (pr_o_adap->hdr.io_status != 0) { - hvnd_error("Open adaptor failed; io status is %d\n", - pr_o_adap->hdr.io_status); - ret = -EINVAL; - goto error_file_bind; - } - - /* - * Copy the necessary response from the host. - */ - - uctx->adaptor_hdl = pr_o_adap->ioctl.resrc_desc.handle; - - - hvnd_debug("adaptor handle: %p\n", (void *)uctx->adaptor_hdl); - - uctx->uar_base = - pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_UAR].info; - hvnd_debug("uar base: %p\n", (void *)uctx->uar_base); - - uctx->bf_base = - pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_BF].info; - hvnd_debug("bf base: %p\n", (void *)uctx->bf_base); - - uctx->bf_buf_size = - pr_o_adap->mappings.ctx_output.bf_buf_size; - hvnd_debug("bf buf size: %d\n", uctx->bf_buf_size); - - uctx->bf_offset = - pr_o_adap->mappings.ctx_output.bf_offset; - hvnd_debug("bf offset: %d\n", uctx->bf_offset); - - uctx->cqe_size = - pr_o_adap->mappings.ctx_output.cqe_size; - hvnd_debug("cqe size: %d\n", uctx->cqe_size); - - uctx->max_qp_wr = - pr_o_adap->mappings.ctx_output.max_qp_wr; - hvnd_debug("max qp wr: %d\n", uctx->max_qp_wr); - - uctx->max_sge = - pr_o_adap->mappings.ctx_output.max_sge; - hvnd_debug("max sge: %d\n", uctx->max_sge); - - uctx->max_cqe = - pr_o_adap->mappings.ctx_output.max_cqe; - hvnd_debug("max cqe: %d\n", uctx->max_cqe); - - uctx->num_qps = - pr_o_adap->mappings.ctx_output.qp_tab_size; - hvnd_debug("num qps: %d\n", uctx->num_qps); - - /* - * Now query the adaptor and stash away the adaptor info. - */ - - ret = hvnd_query_adaptor(nd_dev, uctx); - if (ret) { - hvnd_error("Query Adaptor failed; ret is %d\n", ret); - goto query_err; - } - - return ret; - -query_err: - hvnd_free_handle(nd_dev, uctx, - uctx->adaptor_hdl, - IOCTL_ND_ADAPTER_CLOSE); - - hvnd_error("Open Adaptor Failed!!\n"); - -error_file_bind: - hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, - uctx->file_handle_ovl.remote); - -error_pr_init: - hvnd_cleanup_file(nd_dev, uctx->file_handle.local, - uctx->file_handle.remote); - -error_cr: - if (get_uctx(nd_dev, current_pid()) != NULL) - remove_handle(nd_dev, &nd_dev->uctxidr, current_pid()); - - return ret; -} - -int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - struct hvnd_cq *cq) -{ - int ret; - struct pkt_nd_create_cq *pkt; - int num_pfn, num_db_pfn, num_sn_pfn; - int cq_pkt_size; - unsigned int cq_buf_size, offset; - u32 ext_data_sz; - u32 ext_data_offset; - - /* - * Now create CQ. - * First compute the number of PFNs we need to accommodate: - * One each for doorbell and arm_sn and pages in cq buffer.
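
The PFN-count rule applied here (and again for QP creation below) has to account for the buffer's offset within its first page: a buffer of length bytes starting at page offset off spans DIV_ROUND_UP(off + length, PAGE_SIZE) pages. A standalone check of that rule:

/* page-count arithmetic used for the CQ/QP buffers and doorbells */
#include <stdio.h>

#define PAGE_SIZE 4096UL
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

static unsigned long pfns_needed(unsigned long offset, unsigned long length)
{
	return DIV_ROUND_UP(offset + length, PAGE_SIZE);
}

int main(void)
{
	/* an 8-byte doorbell near the end of a page straddles two pages */
	printf("%lu\n", pfns_needed(4092, 8));		/* -> 2 */
	/* 1024 CQEs of 32 bytes, page aligned: exactly 8 pages */
	printf("%lu\n", pfns_needed(0, 1024 * 32));	/* -> 8 */
	return 0;
}
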
- */ - cq_buf_size = (cq->entries * uctx->cqe_size); - offset = offset_in_page(cq->cq_buf); - num_pfn = DIV_ROUND_UP(offset + cq_buf_size, PAGE_SIZE); - - offset = offset_in_page(cq->db_addr); - num_db_pfn = DIV_ROUND_UP(offset + 8, PAGE_SIZE); - - offset = offset_in_page(&cq->arm_sn); - num_sn_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); - - cq_pkt_size = sizeof(struct pkt_nd_create_cq) + - (num_pfn * sizeof(u64)); - - ext_data_sz = sizeof(struct create_cq_ext_data) + - (num_pfn * sizeof(u64)); - ext_data_offset = offsetof(struct pkt_nd_create_cq, ext_data) - - sizeof(struct ndv_packet_hdr_control_1); - - hvnd_debug("CREATE CQ, num user addr pfns is %d\n", num_pfn); - hvnd_debug("CREATE CQ, num db pfns is %d\n", num_db_pfn); - - pkt = kzalloc(cq_pkt_size, GFP_KERNEL); - - if (!pkt) - return -ENOMEM; - - hvnd_init_hdr(&pkt->hdr, - (cq_pkt_size - - sizeof(struct ndv_packet_hdr_control_1)), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CQ_CREATE, - ext_data_sz, - ext_data_offset, - 0); - - /* - * Now fill out the ioctl section. - */ - - pkt->ioctl.input.version = ND_VERSION_1; - pkt->ioctl.input.queue_depth = cq->entries; - pkt->ioctl.input.ce_mapping_cnt = MLX4_IB_CREATE_CQ_MAPPING_MAX; - pkt->ioctl.input.cb_mapping_offset = sizeof(union create_cq_ioctl); - - hvnd_debug("ce_mapping cnt is %d\n", pkt->ioctl.input.ce_mapping_cnt); - hvnd_debug("cb_mapping offset is %d\n", - pkt->ioctl.input.cb_mapping_offset); - - pkt->ioctl.input.adapter_handle = uctx->adaptor_hdl; - pkt->ioctl.input.affinity.mask = 0; - pkt->ioctl.input.affinity.group = -1; - - // 0 for usermode CQ arming - pkt->mappings.cq_in.flags = 0; - - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.map_type = ND_MAP_MEMORY; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.access_type = ND_MODIFY_ACCESS; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.address = (u64)cq->cq_buf; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.cb_length = (cq->entries * uctx->cqe_size); - - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.access_type = ND_WRITE_ACCESS; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.address = (u64)cq->db_addr; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.cb_length = 8; //size of two ints - - - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.map_type = ND_MAP_MEMORY; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.access_type = ND_MODIFY_ACCESS; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.address = (u64)&cq->arm_sn; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.cb_length = 4; //size of one int - /* - * Fill in the extended data. - */ - - pkt->ext_data.cnt = 3; - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].size = (sizeof(struct gpa_range) + (num_pfn * sizeof(u64))); - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].offset = offsetof(struct create_cq_ext_data, cqbuf_gpa); - - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].size = sizeof(struct cq_db_gpa); - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].offset = offsetof(struct create_cq_ext_data, db_gpa); - - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].size = sizeof(struct cq_db_gpa); - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].offset = offsetof(struct create_cq_ext_data, sn_gpa); - - /* - * Fill up the gpa range for cq buffer. 
- */ - - pkt->ext_data.db_gpa.byte_count = 8; - pkt->ext_data.db_gpa.byte_offset = offset_in_page(cq->db_addr); - user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], cq->db_umem); - - pkt->ext_data.sn_gpa.byte_count = 4; - pkt->ext_data.sn_gpa.byte_offset = offset_in_page(&cq->arm_sn); - init_pfn(&pkt->ext_data.sn_gpa.pfn_array[0], - &cq->arm_sn, - 4); - - pkt->ext_data.cqbuf_gpa.byte_count = (cq->entries * uctx->cqe_size); - pkt->ext_data.cqbuf_gpa.byte_offset = offset_in_page(cq->cq_buf); - user_va_init_pfn(&pkt->ext_data.cqbuf_gpa.pfn_array[0], cq->umem); - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, cq_pkt_size, (u64)pkt); - - if (ret) - goto cr_cq_err; - - /* - * Copy the necessary response from the host. - */ - cq->cqn = pkt->mappings.cq_resp.cqn; - cq->cqe = pkt->mappings.cq_resp.cqe; - cq->cq_handle = pkt->ioctl.resrc_desc.handle; - - ret = insert_handle(nd_dev, &nd_dev->cqidr, cq, cq->cqn); - - if (ret) - goto cr_cq_err; - hvnd_debug("CQ create after success cqn is %d\n", cq->cqn); - hvnd_debug("CQ create after success cqe is %d\n", cq->cqe); - hvnd_debug("CQ create after success cq handle is %p\n", - (void *)cq->cq_handle); - -cr_cq_err: - kfree(pkt); - return ret; -} - -int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq) -{ - struct pkt_nd_free_cq free_cq_pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&free_cq_pkt, 0, sizeof(free_cq_pkt)); - hvnd_init_hdr(&free_cq_pkt.hdr, - sizeof(struct pkt_nd_free_cq) - - sizeof(struct ndv_packet_hdr_control_1), - cq->uctx->create_pkt.handle.local, - cq->uctx->create_pkt.handle.remote, - IOCTL_ND_CQ_FREE, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - free_cq_pkt.ioctl.in.version = ND_VERSION_1; - free_cq_pkt.ioctl.in.handle = cq->cq_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &free_cq_pkt.hdr, - sizeof(struct pkt_nd_free_cq), - (u64)&free_cq_pkt); - - if (ret) - goto free_cq_err; - - remove_handle(nd_dev, &nd_dev->cqidr, cq->cqn); - - return 0; - -free_cq_err: - return ret; -} - - -int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq, - u32 notify_type, u64 irp_handle) -{ - struct pkt_nd_notify_cq notify_cq_pkt; - int ret; - union ndv_context_handle irp_fhandle; - - irp_fhandle.local = cq->ep_object.local_irp; - - // KYS try to avoid having to zero everything - memset(¬ify_cq_pkt, 0, sizeof(notify_cq_pkt)); - hvnd_init_hdr(¬ify_cq_pkt.hdr, - sizeof(struct pkt_nd_notify_cq) - - sizeof(struct ndv_packet_hdr_control_1), - cq->uctx->create_pkt.handle.local, - cq->uctx->create_pkt.handle.remote, - IOCTL_ND_CQ_NOTIFY, 0, 0, irp_fhandle.val64); - - /* - * Now fill in the ioctl section. - */ - notify_cq_pkt.ioctl.in.version = ND_VERSION_1; - notify_cq_pkt.ioctl.in.cq_handle = cq->cq_handle; - notify_cq_pkt.ioctl.in.type = notify_type; - - - ret = hvnd_send_ioctl_pkt(nd_dev, ¬ify_cq_pkt.hdr, - sizeof(struct pkt_nd_notify_cq), - (u64)¬ify_cq_pkt); - - return ret; -} - -/* - * Memory region operations. - */ -int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 pd_handle, u64 *mr_handle) -{ - struct pkt_nd_create_mr pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_MR_CREATE, 0, 0, 0); - - /* - * Now fill in the ioctl section. 
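
Taken together, the MR helpers in this block (hvnd_cr_mr() here, with hvnd_mr_register(), hvnd_deregister_mr() and hvnd_free_mr() just below) form a create/register/deregister/free lifecycle. A hypothetical caller, sketched with the signatures defined in this file; the teardown ordering is an assumption and error handling is elided:

/* hypothetical caller showing the MR lifecycle */
static int example_mr_cycle(struct hvnd_dev *nd_dev,
			    struct hvnd_ucontext *uctx,
			    u64 pd_handle, struct hvnd_mr *mr)
{
	int ret;

	/* create a host-side MR handle under the given PD */
	ret = hvnd_cr_mr(nd_dev, uctx, pd_handle, &mr->mr_handle);
	if (ret)
		return ret;

	/* pin and describe the user pages against that handle */
	ret = hvnd_mr_register(nd_dev, uctx, mr);
	if (ret) {
		hvnd_free_mr(nd_dev, uctx, mr->mr_handle);
		return ret;
	}

	/* ... use mr->ibmr.lkey / mr->ibmr.rkey ... */

	hvnd_deregister_mr(nd_dev, uctx, mr->mr_handle);
	hvnd_free_mr(nd_dev, uctx, mr->mr_handle);
	return 0;
}
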
- */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.handle = pd_handle; - hvnd_debug("PD handle is %p\n", (void *)pd_handle); - pkt.ioctl.in.reserved = 0; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - /* - * Copy the handle. - */ - hvnd_debug("mr handle is %p\n", (void *)pkt.ioctl.out); - *mr_handle = pkt.ioctl.out; - - return 0; - -err: - hvnd_error("create mr failed: %d\n", ret); - return ret; - -} - -int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 handle) -{ - return hvnd_free_handle(nd_dev, uctx, handle, IOCTL_ND_MR_FREE); -} - -int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 handle) -{ - struct pkt_nd_deregister_mr pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_MR_DEREGISTER, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.handle = handle; - pkt.ioctl.in.reserved = 0; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - hvnd_error("de-register mr failed: %d\n", ret); - return ret; - -} - -static inline u32 hvnd_convert_access(int acc) -{ - return (acc & IB_ACCESS_REMOTE_WRITE ? ND_MR_FLAG_ALLOW_REMOTE_WRITE : 0) | - (acc & IB_ACCESS_REMOTE_READ ? ND_MR_FLAG_ALLOW_REMOTE_READ : 0) | - (acc & IB_ACCESS_LOCAL_WRITE ? ND_MR_FLAG_ALLOW_LOCAL_WRITE : 0); -} - - -int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - struct hvnd_mr *mr) -{ - struct pkt_nd_register_mr pkt; - int ret; - struct hv_mpb_array *pb; - struct vmbus_packet_mpb_array *tpb; - int sz_leaf; - int num_pgs; - int i = 0; - int ext_data_sz; - u32 acc_flags; - u32 desc_size; - int pkt_type; - - /* - * The user address is passed in via a two level structure. - * An array of struct hv_page_buffer will be used to describe - * the user memory. The pages containing this array will be described - * in another array of struct hv_page_buffer. We pass this second level - * array to the host. - */ - - hvnd_debug("ib_umem_page_count(mr->umem)=%d\n", - ib_umem_page_count(mr->umem)); - - sz_leaf = ib_umem_page_count(mr->umem) * sizeof(u64) + - sizeof(struct hv_mpb_array); - - pb = (struct hv_mpb_array *) __get_free_pages(GFP_KERNEL|__GFP_ZERO, - get_order(sz_leaf)); - - if (pb == NULL) - return -ENOMEM; - /* - * Allocate an array of hv_page_buffer to describe the first level. - */ - num_pgs = DIV_ROUND_UP(sz_leaf, PAGE_SIZE); - hvnd_debug("num pages in the top array is %d\n", num_pgs); - - desc_size = (num_pgs * sizeof(u64) + - sizeof(struct vmbus_packet_mpb_array)); - tpb = (struct vmbus_packet_mpb_array *) - __get_free_pages(GFP_KERNEL|__GFP_ZERO, get_order(desc_size)); - - if (tpb == NULL) { - free_pages((unsigned long)pb, get_order(sz_leaf)); - return -ENOMEM; - } - - hvnd_debug("sz leaf: %d; pgs in top %d\n", sz_leaf, num_pgs); - - /* - * Now fill the leaf level array. - */ - pb->len = mr->length; - pb->offset = offset_in_page(mr->start); - user_va_init_pfn(pb->pfn_array, mr->umem); - - /* - * Now fill out the top level array.
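
The two-level sizing just computed is easy to sanity-check in isolation: the leaf array needs one u64 PFN per user page plus its header, and the top-level array needs one u64 per page of the leaf array itself. A standalone version with an illustrative header size (the real sizeof(struct hv_mpb_array) differs):

/* two-level page-buffer sizing, mirroring sz_leaf/num_pgs/desc_size */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SIZE 4096UL
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

int main(void)
{
	unsigned long user_pages = 100000;	/* ~400 MB registration */
	unsigned long mpb_hdr = 16;		/* stand-in header size */
	unsigned long sz_leaf = user_pages * sizeof(uint64_t) + mpb_hdr;
	unsigned long num_pgs = DIV_ROUND_UP(sz_leaf, PAGE_SIZE);

	/* even a large registration needs only a small top-level array */
	printf("leaf bytes: %lu, top-level PFNs: %lu\n", sz_leaf, num_pgs);
	return 0;
}
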
- */ - for (i = 0; i < num_pgs; i++) { - tpb->range.pfn_array[i] = virt_to_phys( - (u8 *)pb + (PAGE_SIZE * i)) >> PAGE_SHIFT; - hvnd_debug("virtual address = %p\n", - (u8 *)pb + (PAGE_SIZE * i)); - hvnd_debug("physical address = %llx\n", - virt_to_phys((u8 *)pb + (PAGE_SIZE * i))); - hvnd_debug("tpb->range.pfn_array[%d]=%llx\n", i, - tpb->range.pfn_array[i]); - } - - tpb->range.offset = 8; - tpb->range.len = ib_umem_page_count(mr->umem) * sizeof(u64); - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - ext_data_sz = (ib_umem_page_count(mr->umem) * sizeof(u64)); - acc_flags = ND_MR_FLAG_DO_NOT_SECURE_VM | hvnd_convert_access(mr->acc); - hvnd_debug("memory register access flags are: %x\n", acc_flags); - - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_MR_REGISTER, 0, 0, 0); - - /* - * The memory registration call uses a different mechanism to pass - * pfn information. - */ - - pkt_type = pkt.hdr.pkt_hdr.packet_type; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTION_EXTERNAL_DATA); - pkt.hdr.pkt_hdr.packet_type = pkt_type; - - pkt.hdr.extended_data.size = ext_data_sz; - pkt.hdr.extended_data.offset = 0; - /* - * Now fill out the ioctl. - */ - - pkt.ioctl.in.header.version = ND_VERSION_1; - pkt.ioctl.in.header.flags = acc_flags; - pkt.ioctl.in.header.cb_length = mr->length; - pkt.ioctl.in.header.target_addr = mr->virt; - pkt.ioctl.in.header.mr_handle = mr->mr_handle; - pkt.ioctl.in.address = mr->virt; - - /* - * Now send the packet to the host. - */ - - ret = hvnd_send_pgbuf_ioctl_pkt(nd_dev, - tpb, desc_size, - &pkt.hdr, - sizeof(pkt), - (unsigned long)&pkt); - - if (ret) - goto err; - - hvnd_info("MR REGISTRATION SUCCESS\n"); - /* - * Copy the mr registration data. - */ - hvnd_debug("mr registration lkey %x\n", pkt.ioctl.out.lkey); - hvnd_debug("mr registration rkey %x\n", pkt.ioctl.out.rkey); - - mr->mr_lkey = pkt.ioctl.out.lkey; - mr->mr_rkey = pkt.ioctl.out.rkey; - - mr->ibmr.lkey = mr->mr_lkey; - mr->ibmr.rkey = be32_to_cpu(mr->mr_rkey); - hvnd_debug("ibmr registration lkey %x\n", mr->ibmr.lkey); - hvnd_debug("ibmr registration rkey %x\n", mr->ibmr.rkey); - - free_pages((unsigned long)pb, get_order(sz_leaf)); - free_pages((unsigned long)tpb, get_order(desc_size)); - - return 0; - -err: - free_pages((unsigned long)pb, get_order(sz_leaf)); - free_pages((unsigned long)tpb, get_order(desc_size)); - - hvnd_error("mr register failed: %d\n", ret); - return ret; -} - -/* - * Listener operations. - */ -int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 *listener_handle) -{ - struct pkt_nd_cr_listener pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_LISTENER_CREATE, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; - hvnd_debug("Adaptor handle is %p\n", (void *)uctx->adaptor_hdl); - pkt.ioctl.in.hdr.reserved = 0; - pkt.ioctl.in.to_semantics = false; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - /* - * Copy the listener handle. 
- */ - hvnd_debug("listener handle is %p\n", (void *)pkt.ioctl.out); - *listener_handle = pkt.ioctl.out; - - return 0; - -err: - hvnd_error("create listener failed: ret=%d uctx=%p adaptor handle=%llu\n", - ret, uctx, uctx->adaptor_hdl); - return ret; - -} - -int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 listener_handle) -{ - struct pkt_nd_free_listener pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_LISTENER_FREE, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.handle = listener_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - hvnd_error("free listener failed: %d\n", ret); - return ret; -} - -int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 listener_handle, union nd_sockaddr_inet *addr) -{ - struct pkt_nd_bind_listener pkt; - kuid_t uid = current_uid(); - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_LISTENER_BIND, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.handle = listener_handle; - pkt.ioctl.in.hdr.reserved = 0; - - pkt.ioctl.in.authentication_id = (u32)uid.val; - pkt.ioctl.in.is_admin = false; - - memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - hvnd_error("bind listener failed: %d\n", ret); - return ret; -} - -int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 listener_handle, u32 backlog) -{ - struct pkt_nd_listen_listener pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_LISTENER_LISTEN, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.listener_handle = listener_handle; - pkt.ioctl.in.back_log = backlog; - - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - hvnd_error("listen listener failed: %d\n", ret); - return ret; -} - -int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 listener_handle, union nd_sockaddr_inet *addr) -{ - struct pkt_nd_get_addr_listener pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_LISTENER_GET_ADDRESS, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.handle = listener_handle; - - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - /* - * Copy the address.
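
For orientation, the listener helpers in this block chain together as create, bind, listen, and (just below) get-connection-request. A hypothetical caller using the signatures defined here; the backlog value is illustrative and error handling is trimmed to the essentials:

/* hypothetical caller for the listener lifecycle */
static int example_listen(struct hvnd_dev *nd_dev,
			  struct hvnd_ucontext *uctx,
			  union nd_sockaddr_inet *local_addr)
{
	u64 listener;
	int ret;

	ret = hvnd_cr_listener(nd_dev, uctx, &listener);
	if (ret)
		return ret;

	ret = hvnd_bind_listener(nd_dev, uctx, listener, local_addr);
	if (!ret)
		ret = hvnd_listen_listener(nd_dev, uctx, listener, 128);

	if (ret)
		hvnd_free_listener(nd_dev, uctx, listener);
	return ret;
}
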
- */ - - memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); - - return 0; - -err: - hvnd_error("get addr listener failed: %d\n", ret); - return ret; -} - -int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 listener_handle, u64 connector_handle, - u64 irp_handle) -{ - struct pkt_nd_get_connection_listener pkt; - int ret; - union ndv_context_handle irp_fhandle; - - ret = get_irp_handle(nd_dev, &irp_fhandle.local, (void *)irp_handle); - - if (ret) { - hvnd_error("get_irp_handle() failed: err: %d\n", ret); - return ret; - } - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST, 0, 0, - irp_fhandle.val64); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.listener_handle = listener_handle; - pkt.ioctl.in.connector_handle = connector_handle; - - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - hvnd_error("get connection listener failed: %d\n", ret); - return ret; -} - -/* - * Connector APIs. - */ - -int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 *connector_handle) -{ - struct pkt_nd_cr_connector pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(struct pkt_nd_cr_listener) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_CREATE, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; - pkt.ioctl.in.to_semantics = false; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - /* - * Copy the connector handle. - */ - hvnd_debug("connector handle is %p\n", (void *)pkt.ioctl.out); - *connector_handle = pkt.ioctl.out; - - return 0; - -err: - return ret; -} - -int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 handle) -{ - struct pkt_nd_free_connector pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_FREE, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.handle = handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - return ret; -} - -int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 handle, union nd_sockaddr_inet *addr) -{ - struct pkt_nd_bind_connector pkt; - int ret; - kuid_t uid = current_uid(); - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_BIND, 0, 0, 0); - - /* - * Now fill in the ioctl section.
- */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.handle = handle; - - memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); - - pkt.ioctl.in.authentication_id = (u32)uid.val; - pkt.ioctl.in.is_admin = false; - - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - return ret; -} - -int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit, - u32 priv_data_length, const u8 *priv_data, - u64 qp_handle, struct if_physical_addr *phys_addr, - union nd_sockaddr_inet *dest_addr, - struct hvnd_ep_obj *ep) -{ - struct pkt_nd_connector_connect *pkt = &ep->connector_connect_pkt; - int ret; - union ndv_context_handle irp_fhandle; - - hvnd_debug("local irp is %d\n", ep->local_irp); - irp_fhandle.local = ep->local_irp; - - if (priv_data_length > MAX_PRIVATE_DATA_LEN) { - hvnd_error("priv_data_length=%d\n", priv_data_length); - return -EINVAL; - } - - // KYS try to avoid having to zero everything - memset(pkt, 0, sizeof(*pkt)); - hvnd_init_hdr(&pkt->hdr, - sizeof(*pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_CONNECT, 0, 0, irp_fhandle.val64); - - /* - * Now fill in the ioctl section. - */ - pkt->ioctl.in.hdr.version = ND_VERSION_1; - pkt->ioctl.in.hdr.connector_handle = connector_handle; - pkt->ioctl.in.hdr.read_limits.inbound = in_rd_limit; - pkt->ioctl.in.hdr.read_limits.outbound = out_rd_limit; - pkt->ioctl.in.hdr.cb_private_data_length = priv_data_length; - pkt->ioctl.in.hdr.cb_private_data_offset = - offsetof(union connector_connect_ioctl, in.priv_data); - pkt->ioctl.in.hdr.qp_handle = qp_handle; - - memcpy(&pkt->ioctl.in.hdr.phys_addr, phys_addr, - sizeof(struct if_physical_addr)); - - /* - * Luke's code does not copy the ip address. - */ - memcpy(&pkt->ioctl.in.hdr.destination_address, dest_addr, - sizeof(union nd_sockaddr_inet)); - - pkt->ioctl.in.retry_cnt = 7; - pkt->ioctl.in.rnr_retry_cnt = 7; - memcpy(pkt->ioctl.in.priv_data, priv_data, priv_data_length); - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, sizeof(*pkt), (u64)pkt); - - if (ret) - goto err; - - return 0; - -err: - return ret; -} - -int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, - enum ibv_qp_state *qp_state) -{ - struct pkt_nd_connector_connect_complete pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_COMPLETE_CONNECT, 0, 0, 0); - - /* - * Now fill in the ioctl section. 
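
Most of the synchronous requests in this file repeat the same four steps: zero the packet, hvnd_init_hdr(), set version plus handle, hvnd_send_ioctl_pkt(). A hypothetical consolidation of that boilerplate, valid only for the ops whose input really is the plain {version, reserved, handle} triple (connect/accept/bind carry extra fields and would not fit this shape):

/* hypothetical helper macro for the simple request pattern above */
#define HVND_SIMPLE_IOCTL(nd_dev, uctx, pkt, code, hdl)  ({	\
	int __ret;						\
	memset(&(pkt), 0, sizeof(pkt));				\
	hvnd_init_hdr(&(pkt).hdr,				\
		      sizeof(pkt) -				\
		      sizeof(struct ndv_packet_hdr_control_1),	\
		      (uctx)->create_pkt.handle.local,		\
		      (uctx)->create_pkt.handle.remote,		\
		      (code), 0, 0, 0);				\
	(pkt).ioctl.in.version = ND_VERSION_1;			\
	(pkt).ioctl.in.handle = (hdl);				\
	__ret = hvnd_send_ioctl_pkt((nd_dev), &(pkt).hdr,	\
				    sizeof(pkt), (u64)&(pkt));	\
	__ret;							\
})
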
- */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.handle = connector_handle; - pkt.ioctl.in.rnr_nak_to = 0; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - *qp_state = pkt.ioctl.out.state; - return 0; - -err: - return ret; -} - -int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 connector_handle, - u64 qp_handle, - u32 in_rd_limit, u32 out_rd_limit, - u32 priv_data_length, const u8 *priv_data, - enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep) -{ - struct pkt_nd_connector_accept pkt; - int ret; - union ndv_context_handle irp_fhandle; - - irp_fhandle.local = ep->local_irp; - - if (priv_data_length > MAX_PRIVATE_DATA_LEN) { - hvnd_error("priv_data_length=%d\n", priv_data_length); - return -EINVAL; - } - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_ACCEPT, 0, 0, irp_fhandle.val64); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.reserved = 0; - pkt.ioctl.in.hdr.read_limits.inbound = in_rd_limit; - pkt.ioctl.in.hdr.read_limits.outbound = out_rd_limit; - pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; - - pkt.ioctl.in.hdr.cb_private_data_offset = - offsetof(struct connector_accept_in, private_data); - - pkt.ioctl.in.hdr.connector_handle = connector_handle; - pkt.ioctl.in.hdr.qp_handle = qp_handle; - - pkt.ioctl.in.rnr_nak_to = 0; - pkt.ioctl.in.rnr_retry_cnt = 7; - - - memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - *qp_state = pkt.ioctl.out.state; - return 0; - -err: - return ret; -} - -int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 connector_handle, - u32 priv_data_length, u8 *priv_data, - enum ibv_qp_state *qp_state) -{ - struct pkt_nd_connector_reject pkt; - int ret; - - if (priv_data_length > MAX_PRIVATE_DATA_LEN) { - hvnd_error("priv_data_length=%d\n", priv_data_length); - return -EINVAL; - } - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_REJECT, 0, 0, 0); - - /* - * Now fill in the ioctl section. 
- */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.reserved = 0; - pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; - - pkt.ioctl.in.hdr.cb_private_data_offset = - offsetof(struct connector_reject_in, private_data); - - pkt.ioctl.in.hdr.connector_handle = connector_handle; - - memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - *qp_state = pkt.ioctl.out.state; - return 0; - -err: - return ret; -} - -int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, - struct nd_read_limits *rd_limits) -{ - struct pkt_nd_connector_get_rd_limits pkt; - int ret; - - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_GET_READ_LIMITS, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.in.version = ND_VERSION_1; - pkt.ioctl.in.in.reserved = 0; - pkt.ioctl.in.in.handle = connector_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - *rd_limits = pkt.ioctl.out.out; - return 0; - -err: - return ret; -} - -int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, - u8 *priv_data) -{ - struct pkt_nd_connector_get_priv_data pkt; - int ret; - - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = connector_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - memcpy(priv_data, pkt.ioctl.out, MAX_PRIVATE_DATA_LEN); - return 0; - -err: - return ret; -} - -int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, - union nd_sockaddr_inet *peer_addr) -{ - struct pkt_nd_connector_get_peer_addr pkt; - int ret; - - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS, 0, 0, 0); - - /* - * Now fill in the ioctl section. 
- */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = connector_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - memcpy(peer_addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); - return 0; - -err: - return ret; -} - -int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, - union nd_sockaddr_inet *addr) -{ - struct pkt_nd_connector_get_addr pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_GET_ADDRESS, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = connector_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); - return 0; - -err: - return ret; -} - - -int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, struct hvnd_ep_obj *ep) -{ - struct pkt_nd_connector_notify_disconnect pkt; - int ret; - union ndv_context_handle irp_fhandle; - - irp_fhandle.local = ep->local_irp; - - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT, 0, 0, - irp_fhandle.val64); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = connector_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - return ret; -} - - -//ASYNCH call -int hvnd_connector_disconnect(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, struct hvnd_ep_obj *ep) -{ - struct pkt_nd_connector_disconnect pkt; - int ret; - union ndv_context_handle irp_fhandle; - - irp_fhandle.local = ep->local_irp; - - - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_DISCONNECT, 0, 0, irp_fhandle.val64); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = connector_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - return ret; -} - -/* - * QP operations. - */ -int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - struct hvnd_qp *qp) -{ - int ret; - struct pkt_nd_create_qp *pkt; - int num_pfn, num_db_pfn; - int qp_pkt_size; - unsigned int offset; - u32 ext_data_offset; - u32 ext_data_size; - - /* - * Now create QP. - * First compute the number of PFNs we need to accommodate: - * One for the doorbell, plus the pages in the qp buffer.
- */ - offset = offset_in_page(qp->qp_buf); - num_pfn = DIV_ROUND_UP(offset + qp->buf_size, PAGE_SIZE); - - offset = offset_in_page(qp->db_addr); - num_db_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); - - qp_pkt_size = sizeof(struct pkt_nd_create_qp) + - (num_pfn * sizeof(u64)); - - hvnd_debug("CREATE QP, num pfns is %d\n", num_pfn); - hvnd_debug("CREATE QP, num DB pfns is %d\n", num_db_pfn); - - pkt = kzalloc(qp_pkt_size, GFP_KERNEL); - - if (!pkt) - return -ENOMEM; - - hvnd_debug("offset of nd_create_qp is %d\n", - (int)offsetof(struct pkt_nd_create_qp, ioctl.input)); - - ext_data_offset = offsetof(struct pkt_nd_create_qp, ext_data) - - sizeof(struct ndv_packet_hdr_control_1); - - ext_data_size = sizeof(struct create_qp_ext_data) + - (num_pfn * sizeof(u64)); - - hvnd_init_hdr(&pkt->hdr, - qp_pkt_size - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_QP_CREATE, - ext_data_size, - ext_data_offset, - 0); - - /* - * Now fill out the ioctl section. - */ - - pkt->ioctl.input.hdr.version = ND_VERSION_1; - - if (qp->max_inline_data > nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold) - qp->max_inline_data = nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold; - pkt->ioctl.input.hdr.cb_max_inline_data = qp->max_inline_data; - - hvnd_debug("pkt->ioctl.input.hdr.cb_max_inline_data=%d\n", - pkt->ioctl.input.hdr.cb_max_inline_data); - - pkt->ioctl.input.hdr.ce_mapping_cnt = MLX4_IB_CREATE_QP_MAPPINGS_MAX; - pkt->ioctl.input.hdr.cb_mapping_offset = sizeof(union create_qp_ioctl); - - pkt->ioctl.input.hdr.initiator_queue_depth = qp->initiator_q_depth; - pkt->ioctl.input.hdr.max_initiator_request_sge = - qp->initiator_request_sge; - - hvnd_debug("recv cq handle is %p\n", (void *)qp->receive_cq_handle); - hvnd_debug("send cq handle is %p\n", (void *)qp->initiator_cq_handle); - hvnd_debug("pd handle is %p\n", (void *)qp->pd_handle); - pkt->ioctl.input.hdr.receive_cq_handle = qp->receive_cq_handle; - pkt->ioctl.input.hdr.initiator_cq_handle = qp->initiator_cq_handle; - pkt->ioctl.input.hdr.pd_handle = qp->pd_handle; - - - hvnd_debug("ce_mapping cnt is %d\n", - pkt->ioctl.input.hdr.ce_mapping_cnt); - hvnd_debug("cb_mapping offset is %d\n", - pkt->ioctl.input.hdr.cb_mapping_offset); - - pkt->ioctl.input.receive_queue_depth = qp->receive_q_depth; - pkt->ioctl.input.max_receive_request_sge = qp->receive_request_sge; - - - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.map_type = ND_MAP_MEMORY; - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.access_type = ND_MODIFY_ACCESS; - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.address = (u64)qp->qp_buf; - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.cb_length = qp->buf_size; - - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.access_type = ND_WRITE_ACCESS; - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.address = (u64)qp->db_addr; - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.cb_length = 4; - - pkt->mappings.qp_in.log_sq_bb_count = qp->log_sq_bb_count; - pkt->mappings.qp_in.log_sq_stride = qp->log_sq_stride; - pkt->mappings.qp_in.sq_no_prefetch = qp->sq_no_prefetch; - - - /* - * Fill in the extended data. 
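The PFN counts at the top of this function have to cover buffers that need not start on a page boundary, hence the offset term inside DIV_ROUND_UP(). A worked example of the same arithmetic, with PAGE_SIZE fixed at 4096 for illustration:

    #include <stdio.h>

    #define PAGE_SIZE 4096UL
    #define offset_in_page(p) ((unsigned long)(p) & (PAGE_SIZE - 1))
    #define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

    int main(void)
    {
        unsigned long qp_buf = 0x7f32a000UL + 0xf00; /* 0xf00 into a page */
        unsigned long buf_size = 8192;               /* two pages of data */
        unsigned long off = offset_in_page(qp_buf);

        /* 0xf00 + 8192 bytes straddle three pages, not two. */
        printf("num_pfn = %lu\n", DIV_ROUND_UP(off + buf_size, PAGE_SIZE));
        return 0;
    }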
- */ - - pkt->ext_data.cnt = 2; - pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].size = - sizeof(struct gpa_range) + (num_pfn * sizeof(u64)); - pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].offset = - offsetof(struct create_qp_ext_data, qpbuf_gpa); - - pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].size = - sizeof(struct qp_db_gpa); - pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].offset = - offsetof(struct create_qp_ext_data, db_gpa); - - /* - * Fill up the gpa range for qp buffer. - */ - - pkt->ext_data.db_gpa.byte_count = 4; // KYS 8 or 16? - pkt->ext_data.db_gpa.byte_offset = offset_in_page(qp->db_addr); - user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], qp->db_umem); - - pkt->ext_data.qpbuf_gpa.byte_count = qp->buf_size; - pkt->ext_data.qpbuf_gpa.byte_offset = offset_in_page(qp->qp_buf); - user_va_init_pfn(&pkt->ext_data.qpbuf_gpa.pfn_array[0], qp->umem); - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, qp_pkt_size, (u64)pkt); - - if (ret) - goto cr_qp_err; - - /* - * Copy the necessary response from the host. - */ - qp->qp_handle = pkt->ioctl.resrc_desc.handle; - - qp->qpn = pkt->mappings.qp_resp.qpn; - qp->max_send_wr = pkt->mappings.qp_resp.max_send_wr; - qp->max_recv_wr = pkt->mappings.qp_resp.max_recv_wr; - qp->max_send_sge = pkt->mappings.qp_resp.max_send_sge; - qp->max_recv_sge = pkt->mappings.qp_resp.max_recv_sge; - - - hvnd_debug("qp->max_send_wr=%d max_recv_wr=%d max_send_sge=%d max_recv_sge=%d max_inline_data=%d\n", - qp->max_send_wr, qp->max_recv_wr, qp->max_send_sge, - qp->max_recv_sge, qp->max_inline_data); - - ret = insert_handle(nd_dev, &nd_dev->qpidr, qp, qp->qpn); - - if (ret) - goto cr_qp_err; - - hvnd_debug("QP create after success qpn:%d qp:%p handle:%llu\n", - qp->qpn, qp, qp->qp_handle); - -cr_qp_err: - kfree(pkt); - return ret; -} - -int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - struct hvnd_qp *qp) -{ - int ret; - - ret = hvnd_free_handle(nd_dev, uctx, qp->qp_handle, IOCTL_ND_QP_FREE); - if (ret == 0) - remove_handle(nd_dev, &nd_dev->qpidr, qp->qpn); - return ret; -} - -int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - struct hvnd_qp *qp) -{ - struct pkt_nd_flush_qp pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_QP_FLUSH, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = qp->qp_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - return ret; -} - - -int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, - char *mac_addr) -{ - int ret; - int pkt_type = NDV_PKT_ID1_BIND; - - /* - * Send the bind information over to the host. - * For now, we will have a single ip and MAC address that we - * will deal with. Down the road we will need to expand support - * for multiple IP and MAC addresses and also deal with changing - * IP addresses. 
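insert_handle() and remove_handle() are not part of this hunk; under the assumption that nd_dev->qpidr is a plain kernel IDR, they plausibly reduce to pinning the object at its qpn so the completion path can look it up later:

    #include <linux/idr.h>
    #include <linux/gfp.h>

    /* Sketch only; the driver's real helpers also take the hvnd_dev. */
    static int insert_handle(struct idr *table, void *obj, u32 id)
    {
        /* Reserve exactly [id, id + 1) so lookups by qpn succeed. */
        int ret = idr_alloc(table, obj, id, id + 1, GFP_KERNEL);

        return ret < 0 ? ret : 0;
    }

    static void remove_handle(struct idr *table, u32 id)
    {
        idr_remove(table, id);
    }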
- */ - - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - hvnd_debug("bind packet type is %d ID:%d\n", - pkt_type, NDV_PACKET_TYPE_ID(pkt_type)); - nd_dev->bind_pkt.pkt_hdr.packet_type = pkt_type; - - nd_dev->bind_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_bind_1); - hvnd_debug("bind packet size is %d\n", - (int)sizeof(struct ndv_pkt_hdr_bind_1)); - nd_dev->bind_pkt.pkt_hdr.data_sz = 0; - nd_dev->bind_pkt.unbind = un_bind; - nd_dev->bind_pkt.ip_address.address_family = AF_INET; - nd_dev->bind_pkt.ip_address.ipv4.sin_family = AF_INET; - nd_dev->bind_pkt.ip_address.ipv4.sin_port = 0; - nd_dev->bind_pkt.ip_address.ipv4.sin_addr.s_addr = *(unsigned int *)ip_addr; - - nd_dev->bind_pkt.phys_addr.length = ETH_ALEN; - memcpy(nd_dev->bind_pkt.phys_addr.addr, mac_addr, ETH_ALEN); - - /* - * This is the adapter handle; needs to be unique for each - * MAC, ip address tuple. - */ - nd_dev->bind_pkt.guest_id = (u64)nd_dev; - - ret = hvnd_send_packet(nd_dev, &nd_dev->bind_pkt, - sizeof(struct ndv_pkt_hdr_bind_1), - (u64)NULL, - true); - return ret; -} - -int hvnd_init_resources(struct hvnd_dev *nd_dev) -{ - unsigned long mmio_sz; - struct resource *resrc; - int ret = -ENOMEM; - - resrc = &iomem_resource; - - mmio_sz = nd_dev->hvdev->channel->offermsg.offer.mmio_megabytes * - 1024 * 1024; - nd_dev->mmio_sz = mmio_sz; - nd_dev->mmio_resource.name = KBUILD_MODNAME; - nd_dev->mmio_resource.flags = IORESOURCE_MEM | IORESOURCE_BUSY; - - ret = allocate_resource(resrc, &nd_dev->mmio_resource, - mmio_sz, 0, -1, mmio_sz, NULL, NULL); - - if (ret) { - hvnd_error("Unable to allocate mmio resources\n"); - return ret; - } - hvnd_debug("MMIO start is %p\n", (void *)nd_dev->mmio_resource.start); - - /* - * Send the mmio information over to the host. - */ - nd_dev->resources.pkt_hdr.packet_type = NDV_PKT_ID1_INIT_RESOURCES; - nd_dev->resources.pkt_hdr.hdr_sz = sizeof(union ndv_packet_hdr); - nd_dev->resources.pkt_hdr.data_sz = 0; - - nd_dev->resources.io_space_sz_mb = mmio_sz; - nd_dev->resources.io_space_start = nd_dev->mmio_resource.start; - - ret = hvnd_send_packet(nd_dev, &nd_dev->resources, - sizeof(struct ndv_pkt_hdr_init_resources_1), - (u64)NULL, - true); - return ret; -} - -int hvnd_query_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) -{ - struct pkt_nd_query_adaptor *pkt; - int ret; - int pkt_type; - struct ib_device_attr *props = &nd_dev->ibdev.attrs; - struct adapter_info_v2 *adap_info = &nd_dev->query_pkt.ioctl.ad_info; - - hvnd_debug("Performing Adapter query nd_dev=%p\n", nd_dev); - - // check if there is a need to do query - if (nd_dev->query_pkt_set) - return 0; - - // need a lock, multiple process can call this at the same time - - down(&nd_dev->query_pkt_sem); - if (nd_dev->query_pkt_set) { - up(&nd_dev->query_pkt_sem); - return 0; - } - - /* - * Now query the adaptor. 
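The adapter query above is a double-checked one-time initialization: query_pkt_set is tested once for the lock-free fast path, then re-tested under the semaphore so that only the first caller pays for the host round trip. Stripped to its skeleton (names illustrative):

    #include <linux/semaphore.h>
    #include <linux/types.h>

    struct once_state {
        bool done;
        struct semaphore sem;    /* sema_init(&sem, 1) during setup */
    };

    static int run_once(struct once_state *st, int (*action)(void))
    {
        int ret = 0;

        if (st->done)            /* fast path: no semaphore taken */
            return 0;

        down(&st->sem);
        if (!st->done) {         /* re-check now that we hold it */
            ret = action();
            if (!ret)
                st->done = true;
        }
        up(&st->sem);
        return ret;
    }

Note that the driver marks query_pkt_set true before checking the send result, so a failed query is never retried; that is worth a second look if this code is ever revived.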
- */ - - pkt = &nd_dev->query_pkt; - - pkt_type = NDV_PKT_ID1_CONTROL; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - - pkt->hdr.pkt_hdr.packet_type = pkt_type; - pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); - pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_query_adaptor) - - sizeof(struct ndv_packet_hdr_control_1); - - - pkt->hdr.file_handle.local = uctx->file_handle.local; - pkt->hdr.file_handle.remote = uctx->file_handle.remote; - - pkt->hdr.irp_handle.val64 = 0; - - pkt->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_QUERY; - pkt->hdr.output_buf_sz = sizeof(struct nd_adap_query_ioctl); - pkt->hdr.input_buf_sz = sizeof(struct nd_adap_query_ioctl); - pkt->hdr.input_output_buf_offset = 0; - memset(&pkt->ioctl.ad_q, 0, sizeof(struct nd_adap_query_ioctl)); - - pkt->ioctl.ad_q.version = ND_VERSION_1; - pkt->ioctl.ad_q.info_version = ND_VERSION_2; - pkt->ioctl.ad_q.adapter_handle = uctx->adaptor_hdl; - - ret = hvnd_send_packet(nd_dev, pkt, - sizeof(struct pkt_nd_query_adaptor), - (unsigned long)pkt, true); - - hvnd_debug("pkt->ioctl.ad_info.inline_request_threshold=%d\n", - pkt->ioctl.ad_info.inline_request_threshold); - nd_dev->query_pkt_set = true; - - // how about host returning PENDING - up(&nd_dev->query_pkt_sem); - - if (ret) - return ret; - - hvnd_debug("Query Adaptor Succeeded\n"); - - /* - * Cache the relevant properties out. - */ - props->fw_ver = 0; - props->device_cap_flags = 0; - props->vendor_id = 0x15b3; - props->vendor_part_id = adap_info->device_id; - props->max_mr_size = ~0ull; - props->page_size_cap = PAGE_SIZE; - props->max_qp = 16384; - props->max_qp_wr = min(adap_info->max_recv_q_depth, - adap_info->max_initiator_q_depth); - props->max_sge = min(adap_info->max_initiator_sge, - adap_info->max_recv_sge); - props->max_cq = 0x1FFFF; - props->max_cqe = adap_info->max_completion_q_depth; - props->max_mr = 16384; - props->max_pd = 16384; - props->max_qp_rd_atom = adap_info->max_inbound_read_limit; - props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; - props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; - props->max_srq = 16384; - props->max_srq_wr = adap_info->max_recv_q_depth; - props->max_srq_sge = adap_info->max_recv_sge; - return 0; -} - - -int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev, - struct hvnd_ib_pd *hvnd_pd) -{ - struct pkt_nd_pd_create *pkt = &uctx->pd_cr_pkt; - int ret; - int pkt_type; - - hvnd_debug("Create Protection Domain\n"); - - pkt_type = NDV_PKT_ID1_CONTROL; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - - pkt->hdr.pkt_hdr.packet_type = pkt_type; - pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); - pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_pd_create) - - sizeof(struct ndv_packet_hdr_control_1); - - hvnd_debug("pdcreate packet size: %d\n", - (int)sizeof(struct pkt_nd_pd_create)); - hvnd_debug("pdcreate hdr size: %d\n", - (int)sizeof(struct ndv_packet_hdr_control_1)); - hvnd_debug("pdcreate data size: %d\n", pkt->hdr.pkt_hdr.data_sz); - - pkt->hdr.file_handle.local = uctx->create_pkt.handle.local; - pkt->hdr.file_handle.remote = uctx->create_pkt.handle.remote; - - hvnd_debug("create pd uctx is %p\n", uctx); - hvnd_debug("create pd local file is %d\n", - uctx->create_pkt.handle.local); - hvnd_debug("create pd local file is %d\n", - uctx->create_pkt.handle.remote); - - pkt->hdr.irp_handle.val64 = 0; - pkt->hdr.io_cntrl_code = IOCTL_ND_PD_CREATE; - - pkt->hdr.output_buf_sz = sizeof(struct nd_create_pd_ioctl); - 
pkt->hdr.input_buf_sz = sizeof(struct nd_create_pd_ioctl); - pkt->hdr.input_output_buf_offset = 0; - - hvnd_debug("output/input buf size: %d\n", pkt->hdr.output_buf_sz); - - /* - * Fill the ioctl section. - */ - pkt->ioctl.in.version = ND_VERSION_1; - pkt->ioctl.in.reserved = 0; - pkt->ioctl.in.handle = uctx->adaptor_hdl; - - - ret = hvnd_send_packet(nd_dev, pkt, - sizeof(struct pkt_nd_pd_create), - (unsigned long)pkt, true); - - if (ret) - return ret; - - if (pkt->hdr.pkt_hdr.status != 0) { - hvnd_error("Create PD failed; status is %d\n", - pkt->hdr.pkt_hdr.status); - return -EINVAL; - } - if (pkt->hdr.io_status != 0) { - hvnd_error("Create PD failed;io status is %d\n", - pkt->hdr.io_status); - return -EINVAL; - } - - hvnd_debug("Create PD Succeeded\n"); - - hvnd_debug("pd_handle is %p\n", (void *)pkt->ioctl.resp.pd_handle); - hvnd_debug("pdn is %d\n", (int)pkt->ioctl.resp.pdn); - - hvnd_pd->pdn = pkt->ioctl.resp.pdn; - hvnd_pd->handle = pkt->ioctl.out_handle; - - return 0; -} - -int hvnd_cancel_io(struct hvnd_ep_obj *ep_object) -{ - struct pkt_nd_cancel_io pkt; - int ret; - u32 ioctl; - - switch (ep_object->type) { - case ND_LISTENER: - hvnd_debug("LISTENER I/O Cancelled\n"); - ioctl = IOCTL_ND_LISTENER_CANCEL_IO; - break; - case ND_CONNECTOR: - hvnd_debug("CONNECTOR I/O Cancelled\n"); - ioctl = IOCTL_ND_CONNECTOR_CANCEL_IO; - break; - case ND_MR: - hvnd_debug("MR I/O Cancelled\n"); - ioctl = IOCTL_ND_MR_CANCEL_IO; - break; - case ND_CQ: - hvnd_debug("CQ I/O Cancelled\n"); - ioctl = IOCTL_ND_CQ_CANCEL_IO; - break; - default: - hvnd_error("UNKNOWN object type\n"); - return -EINVAL; - } - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - ep_object->uctx->create_pkt.handle.local, - ep_object->uctx->create_pkt.handle.remote, - ioctl, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = ep_object->ep_handle; - hvnd_debug("cancel io handle is %p\n", (void *)ep_object->ep_handle); - - ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, &pkt.hdr, - sizeof(pkt), - (u64)&pkt); - - if (ret) - goto err; - - /* - * Now that we have cancelled all I/Os, - */ - - return 0; - -err: - hvnd_error("cancel I/O operation failed\n"); - return ret; -} - - -int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 handle, u32 ioctl) -{ - struct pkt_nd_free_handle pkt; - int ret; - - hvnd_debug("Freeing handle ioctl is %s; handle is %p\n", - hvnd_get_op_name(ioctl), (void *)handle); - - hvnd_debug("uctx is %p\n", uctx); - hvnd_debug("nd_dev is %p\n", nd_dev); - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - ioctl, 0, 0, 0); - - /* - * Now fill in the ioctl section. 
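The switch in hvnd_cancel_io() is a pure type-to-ioctl mapping over enum nd_resource_type (defined in the header later in this patch); the same dispatch could be table-driven, sketched here purely as an illustration:

    /* Illustrative alternative; indices follow enum nd_resource_type. */
    static const u32 cancel_io_ioctl[ND_RESOURCE_TYPE_COUNT] = {
        [ND_LISTENER]  = IOCTL_ND_LISTENER_CANCEL_IO,
        [ND_CONNECTOR] = IOCTL_ND_CONNECTOR_CANCEL_IO,
        [ND_MR]        = IOCTL_ND_MR_CANCEL_IO,
        [ND_CQ]        = IOCTL_ND_CQ_CANCEL_IO,
    };

    static int lookup_cancel_ioctl(enum nd_resource_type type, u32 *ioctl)
    {
        if (type >= ND_RESOURCE_TYPE_COUNT || !cancel_io_ioctl[type])
            return -EINVAL;    /* no cancel path for this object type */
        *ioctl = cancel_io_ioctl[type];
        return 0;
    }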
- */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - hvnd_error("%s: ret=%d\n", __func__, ret); - return ret; -} - -int hvnd_negotiate_version(struct hvnd_dev *nd_dev) -{ - union ndv_packet_init *pkt = &nd_dev->init_pkt; - int ret; - - nd_dev->negotiated_version = NDV_PROTOCOL_VAERSION_INVALID; - - pkt->packet_type = NDV_PACKET_TYPE_INIT; - pkt->protocol_version = NDV_PROTOCOL_VERSION_CURRENT; - pkt->flags = 0; // KYS are the flags 0? - - ret = hvnd_send_packet(nd_dev, pkt, - sizeof(union ndv_packet_init), (u64)NULL, true); - - return ret; -} - -void hvnd_callback(void *context) -{ - struct hv_device *dev = context; - struct hvnd_dev *nd_dev = hv_get_drvdata(dev); - int copy_sz = 0; - struct ndv_packet_hdr_control_1 *ctrl_hdr; - union ndv_packet_init *pkt_init; - u32 recvlen; - u32 local_irp; - u64 requestid; - u32 *pkt_type; - u32 pkt_id; - struct hvnd_ep_obj *ep_object; - struct incoming_pkt *incoming_pkt; /* Used only for asynch calls */ - char *incoming_pkt_start; - struct vmpacket_descriptor *desc; - int status; - struct hvnd_cookie *hvnd_cookie; - unsigned long flags; - - vmbus_recvpacket_raw(dev->channel, hvnd_recv_buffer, - (PAGE_SIZE * 4), &recvlen, &requestid); - - if (recvlen <= 0) - return; - - desc = (struct vmpacket_descriptor *)hvnd_recv_buffer; - incoming_pkt_start = hvnd_recv_buffer + (desc->offset8 << 3); - recvlen -= desc->offset8 << 3; - - pkt_type = (u32 *)incoming_pkt_start; - pkt_id = *pkt_type; - if (pkt_id != NDV_PACKET_TYPE_INIT) - pkt_id = NDV_PACKET_TYPE_ID(pkt_id); - - switch (pkt_id) { - case NDV_PACKET_TYPE_INIT: - /* - * Host is responding to our init packet. - */ - pkt_init = (union ndv_packet_init *)incoming_pkt_start; - nd_dev->negotiated_version = pkt_init->protocol_version; - copy_sz = 0; - break; - - case NDV_PKT_ID1_INIT_RESOURCES: - copy_sz = 0; - break; - - case NDV_PKT_ID1_BIND: - nd_dev->bind_pkt.pkt_hdr.status = - ((union ndv_packet_hdr *) incoming_pkt_start)->status; - copy_sz = 0; - break; - - case NDV_PKT_ID1_COMPLETE: - ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; - status = ctrl_hdr->io_status; - - local_irp = ctrl_hdr->irp_handle.local; - ep_object = (struct hvnd_ep_obj *)map_irp_to_ctx(nd_dev, - local_irp); - - if (!ep_object) { - hvnd_error("irp could not be mapped; irp is %d ioctl is %s", - local_irp, - hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); - goto complete; - } - - if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) - hvnd_debug("completion packet; iostatus is %x, ioctl is %s", - ctrl_hdr->io_status, - hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); - - switch (ctrl_hdr->io_cntrl_code) { - - case IOCTL_ND_CQ_NOTIFY: - hvnd_process_cq_event_complete(ep_object, status); - - ep_del_work_pending(ep_object); - goto complete; - - case IOCTL_ND_CONNECTOR_ACCEPT: - - hvnd_process_connector_accept(ep_object, status); - - ep_del_work_pending(ep_object); - goto complete; - - case IOCTL_ND_CONNECTOR_DISCONNECT: - hvnd_debug("disconnected: ep opj is %p; status: %d\n", - ep_object, status); - hvnd_process_disconnect(ep_object, status); - - ep_del_work_pending(ep_object); - goto complete; - - default: - break; - } - - /* - * This is the completion notification; - * the IRP cookie is the state through which - * we will invoke the callback. 
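Two completion paths meet in this callback: asynchronous endpoint packets are queued to a work item below, while synchronous requests are completed through a stack cookie whose address made the round trip as the VMBus requestid (the memcpy()/complete() pair near the end of the function). A sketch of the sending half of that handshake, assuming the cookie is just a completion plus a response buffer:

    #include <linux/hyperv.h>
    #include <linux/completion.h>

    struct hvnd_cookie {
        struct completion host_event;
        void *pkt;                 /* callback copies the response here */
    };

    /* Sketch: pairs with the hvnd_cookie handling in hvnd_callback(). */
    static int send_and_wait(struct hv_device *dev, void *pkt, u32 len)
    {
        struct hvnd_cookie cookie = { .pkt = pkt };
        int ret;

        init_completion(&cookie.host_event);

        /* The cookie's address travels as the VMBus requestid. */
        ret = vmbus_sendpacket(dev->channel, pkt, len,
                               (unsigned long)&cookie,
                               VM_PKT_DATA_INBAND,
                               VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
        if (ret)
            return ret;

        wait_for_completion(&cookie.host_event);
        return 0;
    }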
- */ - incoming_pkt = kmalloc(recvlen + sizeof(struct incoming_pkt), - GFP_ATOMIC); - if (incoming_pkt == NULL) { - hvnd_error("Could not alloc memory in callback\n"); - ep_del_work_pending(ep_object); - goto complete; - } - memcpy(incoming_pkt->pkt, incoming_pkt_start, recvlen); - - spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); - list_add_tail(&incoming_pkt->list_entry, - &ep_object->incoming_pkt_list); - spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, - flags); - - schedule_work(&ep_object->wrk.work); - - goto complete; - - case NDV_PKT_ID1_CREATE: - copy_sz = sizeof(struct ndv_pkt_hdr_create_1); - break; - - case NDV_PKT_ID1_CLEANUP: - copy_sz = sizeof(struct ndv_pkt_hdr_cleanup_1); - break; - - case NDV_PKT_ID1_CONTROL: - ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; - status = ctrl_hdr->io_status; - - if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) - hvnd_debug("packet; iostatus is %x ioctl is %s", - ctrl_hdr->io_status, hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); - - switch (ctrl_hdr->io_cntrl_code) { - - case IOCTL_ND_PROVIDER_INIT: - copy_sz = sizeof(struct pkt_nd_provider_ioctl); - break; - - case IOCTL_ND_PROVIDER_BIND_FILE: - copy_sz = sizeof(struct pkt_nd_provider_ioctl); - break; - - case IOCTL_ND_ADAPTER_OPEN: - copy_sz = sizeof(struct pkt_nd_open_adapter); - break; - - case IOCTL_ND_ADAPTER_CLOSE: - copy_sz = sizeof(struct pkt_nd_free_handle); - break; - - case IOCTL_ND_ADAPTER_QUERY: - copy_sz = sizeof(struct pkt_nd_query_adaptor); - break; - - case IOCTL_ND_PD_CREATE: - copy_sz = sizeof(struct pkt_nd_pd_create); - break; - - case IOCTL_ND_PD_FREE: - copy_sz = sizeof(struct pkt_nd_free_handle); - break; - - case IOCTL_ND_CQ_CREATE: - copy_sz = sizeof(struct pkt_nd_create_cq); - break; - - case IOCTL_ND_CQ_FREE: - copy_sz = sizeof(struct pkt_nd_free_cq); - break; - - case IOCTL_ND_CQ_NOTIFY: //FIXME check ep stop state - local_irp = ctrl_hdr->irp_handle.local; - ep_object = (struct hvnd_ep_obj *) - map_irp_to_ctx(nd_dev, local_irp); - if (!ep_object) { - hvnd_error("irp could not be mapped\n"); - goto complete; - return; - } - copy_sz = sizeof(struct pkt_nd_notify_cq); - hvnd_process_cq_event_pending(ep_object, status); - goto complete; - return; - - case IOCTL_ND_LISTENER_CREATE: - copy_sz = sizeof(struct pkt_nd_cr_listener); - break; - - case IOCTL_ND_LISTENER_FREE: - copy_sz = sizeof(struct pkt_nd_free_listener); - break; - - case IOCTL_ND_QP_FREE: - copy_sz = sizeof(struct pkt_nd_free_handle); - break; - - case IOCTL_ND_CONNECTOR_CANCEL_IO: - case IOCTL_ND_MR_CANCEL_IO: - case IOCTL_ND_CQ_CANCEL_IO: - case IOCTL_ND_LISTENER_CANCEL_IO: - copy_sz = sizeof(struct pkt_nd_cancel_io); - break; - - case IOCTL_ND_LISTENER_BIND: - copy_sz = sizeof(struct pkt_nd_bind_listener); - break; - - case IOCTL_ND_LISTENER_LISTEN: - copy_sz = sizeof(struct pkt_nd_listen_listener); - break; - - case IOCTL_ND_LISTENER_GET_ADDRESS: - copy_sz = sizeof(struct pkt_nd_get_addr_listener); - break; - - case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: - copy_sz = sizeof(struct pkt_nd_get_connection_listener); - goto complete; // non-block - - case IOCTL_ND_CONNECTOR_CREATE: - copy_sz = sizeof(struct pkt_nd_cr_connector); - break; - - case IOCTL_ND_CONNECTOR_FREE: - copy_sz = sizeof(struct pkt_nd_free_connector); - break; - - case IOCTL_ND_CONNECTOR_BIND: - copy_sz = sizeof(struct pkt_nd_free_connector); - break; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_CONNECT: - copy_sz = sizeof(struct pkt_nd_connector_connect); - goto complete; 
//non-block - - case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: - copy_sz = sizeof(struct pkt_nd_connector_connect_complete); - break; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_ACCEPT: - copy_sz = sizeof(struct pkt_nd_connector_accept); - goto complete; //non-block - - case IOCTL_ND_CONNECTOR_REJECT: - copy_sz = sizeof(struct pkt_nd_connector_reject); - break; - - case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: - copy_sz = sizeof(struct pkt_nd_connector_get_rd_limits); - break; - - case IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: - copy_sz = sizeof(struct pkt_nd_connector_get_priv_data); - break; - - case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: - copy_sz = sizeof(struct pkt_nd_connector_get_peer_addr); - break; - - case IOCTL_ND_CONNECTOR_GET_ADDRESS: - copy_sz = sizeof(struct pkt_nd_connector_get_addr); - break; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: - copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); - goto complete; //non-block - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_DISCONNECT: - hvnd_debug("IOCTL_ND_CONNECTOR_DISCONNECT\n"); - copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); - goto complete; // non-block - - case IOCTL_ND_QP_CREATE: - copy_sz = sizeof(struct pkt_nd_create_qp); - break; - - case IOCTL_ND_MR_CREATE: - copy_sz = sizeof(struct pkt_nd_create_mr); - break; - - case IOCTL_ND_MR_FREE: - copy_sz = sizeof(struct pkt_nd_free_handle); - break; - - case IOCTL_ND_MR_REGISTER: - copy_sz = sizeof(struct pkt_nd_register_mr); - break; - - case IOCTL_ND_MR_DEREGISTER: - copy_sz = sizeof(struct pkt_nd_deregister_mr); - break; - - case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: - copy_sz = sizeof(struct pkt_query_addr_list); - break; - - case IOCTL_ND_QP_FLUSH: - copy_sz = sizeof(struct pkt_nd_flush_qp); - break; - - default: - hvnd_warn("Got unknown ioctl: %d\n", - ctrl_hdr->io_cntrl_code); - copy_sz = 0; - break; - } - - break; - default: - hvnd_warn("Got an unknown packet type %d\n", *pkt_type); - break; - } - - hvnd_cookie = (struct hvnd_cookie *)requestid; - memcpy(hvnd_cookie->pkt, incoming_pkt_start, copy_sz); - complete(&hvnd_cookie->host_event); - -complete: - /* send out ioctl completion patcket */ - if (desc->flags & VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED) { - int retry = 5; - - while (true) { - int ret; - - ret = vmbus_sendpacket(dev->channel, NULL, 0, - requestid, VM_PKT_COMP, 0); - if (ret == 0) { - break; - } else if (ret == -EAGAIN) { - if (--retry == 0) { - hvnd_error("give up retrying send completion packet\n"); - break; - } - hvnd_warn("retrying send completion packet\n"); - udelay(100); - } else { - hvnd_error("unable to send completion packet ret=%d\n", - ret); - break; - } - } - } - -} diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.h deleted file mode 100644 index 78e534fa8748..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-142/vmbus_rdma.h +++ /dev/null @@ -1,2202 +0,0 @@ -/* - * Copyright (c) 2014, Microsoft Corporation. - * - * Author: - * K. Y. Srinivasan <kys@microsoft.com> - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License version 2 as published - * by the Free Software Foundation. 
- * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or - * NON INFRINGEMENT. See the GNU General Public License for more - * details. - * - * Bug fixes/enhancements: Long Li <longli@microsoft.com> - */ - - -#ifndef _VMBUS_RDMA_H -#define _VMBUS_RDMA_H - - -#include <linux/in.h> -#include <linux/in6.h> -#include <rdma/ib_verbs.h> -#include <linux/idr.h> -#include <linux/if_ether.h> - -/* NetworkDirect version Numbers. - */ -#define ND_VERSION_1 0x1 -#define ND_VERSION_2 0x20000 - -#ifndef NDVER -#define NDVER ND_VERSION_2 -#endif - -#define ND_ADAPTER_FLAG_IN_ORDER_DMA_SUPPORTED 0x00000001 -#define ND_ADAPTER_FLAG_CQ_INTERRUPT_MODERATION_SUPPORTED 0x00000004 -#define ND_ADAPTER_FLAG_MULTI_ENGINE_SUPPORTED 0x00000008 -#define ND_ADAPTER_FLAG_CQ_RESIZE_SUPPORTED 0x00000100 -#define ND_ADAPTER_FLAG_LOOPBACK_CONNECTIONS_SUPPORTED 0x00010000 - -#define ND_CQ_NOTIFY_ERRORS 0 -#define ND_CQ_NOTIFY_ANY 1 -#define ND_CQ_NOTIFY_SOLICITED 2 - -#define ND_MR_FLAG_ALLOW_LOCAL_WRITE 0x00000001 -#define ND_MR_FLAG_ALLOW_REMOTE_READ 0x00000002 -#define ND_MR_FLAG_ALLOW_REMOTE_WRITE 0x00000005 -#define ND_MR_FLAG_RDMA_READ_SINK 0x00000008 -#define ND_MR_FLAG_DO_NOT_SECURE_VM 0x80000000 - -#define ND_OP_FLAG_SILENT_SUCCESS 0x00000001 -#define ND_OP_FLAG_READ_FENCE 0x00000002 -#define ND_OP_FLAG_SEND_AND_SOLICIT_EVENT 0x00000004 -#define ND_OP_FLAG_ALLOW_READ 0x00000008 -#define ND_OP_FLAG_ALLOW_WRITE 0x00000010 - -#if NDVER >= ND_VERSION_2 -#define ND_OP_FLAG_INLINE 0x00000020 -#endif - -#define ND_AF_INET6 23 -#define IF_MAX_ADDR_LENGTH 32 - -struct group_affinity { - u64 mask; //KYS: usually 0 - u16 group; // KYS usually -1 - u16 reserved[3]; -}; - -struct if_physical_addr { - u16 length; - u8 addr[IF_MAX_ADDR_LENGTH]; -}; - -struct adapter_info_v2 { - u32 info_version; - u16 vendor_id; - u16 device_id; - u64 adapter_id; - size_t max_registration_size; - size_t max_window_size; - u32 max_initiator_sge; - u32 max_recv_sge; - u32 max_read_sge; - u32 max_transfer_length; - u32 max_inline_data_size; - u32 max_inbound_read_limit; - u32 max_outbound_read_limit; - u32 max_recv_q_depth; - u32 max_initiator_q_depth; - u32 max_shared_recv_q_depth; - u32 max_completion_q_depth; - u32 inline_request_threshold; - u32 large_request_threshold; - u32 max_caller_data; - u32 max_callee_data; - u32 adapter_flags; -} __packed; - -struct nd2_adapter_info_32 { //KYS: Check what this is - u32 info_version; - u16 vendor_id; - u16 devic_id; - u64 adapter_id; - u32 max_registration_size; - u32 max_window_size; - u32 max_initiator_sge; - u32 max_recv_sge; - u32 max_read_sge; - u32 max_transfer_length; - u32 max_inline_data_size; - u32 max_inbound_read_limit; - u32 max_outbound_read_limit; - u32 max_recv_q_depth; - u32 max_initiator_q_depth; - u32 max_shared_recv_q_depth; - u32 max_completion_q_depth; - u32 inline_request_threshold; - u32 large_request_threshold; - u32 max_caller_data; - u32 max_callee_data; - u32 adapter_flags; -} __packed; - -enum nd2_request_type { - ND2_RT_RECEIVE, - ND2_RT_SEND, - ND2_RT_BIND, - ND2_RT_INVALIDATE, - ND2_RT_READ, - ND2_RT_WRITE -}; - -struct nd2_result { - u32 status; - u32 bytes_transferred; - void *qp_ctx; - void *request_ctx; - enum nd2_request_type request_type; -} __packed; - -struct nd2_sge { - void *buffer; - u32 buffer_length; - u32 mr_token; -} __packed; - -/* - * The communication with the host via ioctls using VMBUS - * as the 
transport. - */ - -#define ND_IOCTL_VERSION 1 - -enum nd_mapping_type { - ND_MAP_IOSPACE, - ND_MAP_MEMORY, - ND_MAP_MEMORY_COALLESCE, - ND_MAP_PAGES, - ND_MAP_PAGES_COALLESCE, - ND_UNMAP_IOSPACE, - ND_UNMAP_MEMORY, - ND_MAX_MAP_TYPE -}; - -enum nd_caching_type { - ND_NON_CACHED = 0, - ND_CACHED, - ND_WRITE_COMBINED, - ND_MAX_CACHE_TYPE -}; - -enum nd_aceess_type { - ND_READ_ACCESS = 0, - ND_WRITE_ACCESS, - ND_MODIFY_ACCESS -}; - -struct nd_map_io_space { - enum nd_mapping_type map_type; - enum nd_caching_type cache_type; - u32 cb_length; -}; - -struct nd_map_memory { - enum nd_mapping_type map_type; - enum nd_aceess_type access_type; - u64 address; - u32 cb_length; -}; - -struct nd_mapping_id { - enum nd_mapping_type map_type; - u64 id; -}; - -struct ndk_map_pages { - struct nd_map_memory header; - u32 page_offset; -}; - -union nd_mapping { - enum nd_mapping_type map_type; - struct nd_map_io_space map_io_space; - struct nd_map_memory map_memory; - struct nd_mapping_id mapping_id; - struct ndk_map_pages map_pages; -}; - -struct nd_mapping_result { - u64 id; - u64 info; -}; - -struct nd_resource_descriptor { - u64 handle; - u32 ce_mapping_results; - u32 cb_mapping_results_offset; -}; - -struct nd_handle { - u32 version; - u32 reserved; - u64 handle; -}; - -union nd_sockaddr_inet { - struct sockaddr_in ipv4; - struct sockaddr_in6 ipv6; - u16 address_family; //KYS how is this supposed to work? -}; - -struct nd_address_element { - union nd_sockaddr_inet addr; - char mac_addr[ETH_ALEN]; -}; - -struct nd_resolve_address { - u32 version; - u32 reserved; - union nd_sockaddr_inet address; -}; - -struct nd_open_adapter { - u32 version; - u32 reserved; - u32 ce_mapping_cnt; - u32 cb_mapping_offset; - u64 adapter_id; -}; - -struct nd_adapter_query { - u32 version; - u32 info_version; - u64 adapter_handle; -}; - -struct nd_create_cq { - u32 version; - u32 queue_depth; - u32 ce_mapping_cnt; - u32 cb_mapping_offset; - u64 adapter_handle; - struct group_affinity affinity; -}; - -struct nd_create_srq { - u32 version; - u32 queue_depth; - u32 ce_mapping_cnt; - u32 cb_mapping_offset; - u32 max_request_sge; - u32 notify_threshold; - u64 pd_handle; - struct group_affinity affinity; -}; - -struct nd_create_qp_hdr { - u32 version; - u32 cb_max_inline_data; - u32 ce_mapping_cnt; - u32 cb_mapping_offset; //KYS: what is this prefix - ce/cb - u32 initiator_queue_depth; - u32 max_initiator_request_sge; - u64 receive_cq_handle; - u64 initiator_cq_handle; - u64 pd_handle; -}; - -struct nd_create_qp { - struct nd_create_qp_hdr hdr; - u32 receive_queue_depth; - u32 max_receive_request_sge; -}; - -struct nd_create_qp_with_srq { - struct nd_create_qp_hdr header; - u64 srq_handle; -}; - -struct nd_srq_modify { - u32 version; - u32 queue_depth; - u32 ce_mapping_cnt; - u32 cb_mapping_offset; - u32 notify_threshold; - u32 reserved; - u64 srq_handle; -}; - -struct nd_cq_modify { - u32 version; - u32 queue_depth; - u32 ce_mapping_count; - u32 cb_mappings_offset; - u64 cq_handle; -}; - -struct nd_cq_notify { - u32 version; - u32 type; - u64 cq_handle; -}; - -struct nd_mr_register_hdr { - u32 version; - u32 flags; - u64 cb_length; - u64 target_addr; - u64 mr_handle; -}; - -struct nd_mr_register { - struct nd_mr_register_hdr header; - u64 address; -}; - -struct nd_bind { - u32 version; - u32 reserved; - u64 handle; - union nd_sockaddr_inet address; -}; - -struct nd_read_limits { - u32 inbound; - u32 outbound; -}; - -struct nd_connect { - u32 version; - u32 reserved; - struct nd_read_limits read_limits; - u32 cb_private_data_length; 
- u32 cb_private_data_offset; - u64 connector_handle; - u64 qp_handle; - union nd_sockaddr_inet destination_address; - struct if_physical_addr phys_addr; -}; - -struct nd_accept { - u32 version; - u32 reserved; - struct nd_read_limits read_limits; - u32 cb_private_data_length; - u32 cb_private_data_offset; - u64 connector_handle; - u64 qp_handle; -}; - -struct nd_reject { - u32 version; - u32 reserved; - u32 cb_private_data_length; - u32 cb_private_data_offset; - u64 connector_handle; -}; - -struct nd_listen { - u32 version; - u32 back_log; - u64 listener_handle; -}; - -struct nd_get_connection_request { - u32 version; - u32 reserved; - u64 listener_handle; - u64 connector_handle; -}; - -enum ndv_mmio_type { - ND_PARTITION_KERNEL_VIRTUAL, - ND_PARTITION_SYSTEM_PHYSICAL, - ND_PARTITION_GUEST_PHYSICAL, - ND_MAXIMUM_MMIO_TYPE -}; - -struct ndv_resolve_adapter_id { - u32 version; - struct if_physical_addr phys_addr; -}; - -struct ndv_partition_create { - u32 version; - enum ndv_mmio_type mmio_type; - u64 adapter_id; - u64 xmit_cap; -}; - -struct ndv_partition_bind_luid { - u32 version; - u32 reserved; - u64 partition_handle; - struct if_physical_addr phys_addr; - //IF_LUID luid; //KYS? -}; - -struct ndv_partition_bind_address { - u32 version; - u32 reserved; - u64 partition_handle; - union nd_sockaddr_inet address; - struct if_physical_addr guest_phys_addr; - struct if_physical_addr phys_addr; -}; - -struct ndk_mr_register { - struct nd_mr_register_hdr hdr; - u32 cb_logical_page_addresses_offset; -}; - -struct ndk_bind { - struct nd_bind hdr; - u64 authentication_id; - bool is_admin; -}; - -#define FDN 0x12 -#define METHOD_BUFFERED 0x0 -#define FAA 0x0 - -#define CTL_CODE( DeviceType, Function, Method, Access ) ( \ - ((DeviceType) << 16) | ((Access) << 14) | ((Function) << 2) | (Method) \ -) - -#define ND_FUNCTION(r_, i_) ((r_) << 6 | (i_)) -#define IOCTL_ND(r_, i_) \ - CTL_CODE( FDN, ND_FUNCTION((r_), (i_)), METHOD_BUFFERED, FAA ) - -#define ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) ((ctrlCode_ >> 2) & 0xFFF) -#define ND_RESOURCE_FROM_CTL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) >> 6) -#define ND_OPERATION_FROM_CTRL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) & 0x3F) - -#define ND_DOS_DEVICE_NAME L"\\DosDevices\\Global\\NetworkDirect" -#define ND_WIN32_DEVICE_NAME L"\\\\.\\NetworkDirect" - -enum nd_resource_type { - ND_PROVIDER = 0, - ND_ADAPTER, - ND_PD, - ND_CQ, - ND_MR, - ND_MW, - ND_SRQ, - ND_CONNECTOR, - ND_LISTENER, - ND_QP, - ND_VIRTUAL_PARTITION, - ND_RESOURCE_TYPE_COUNT -}; - -#define ND_OPERATION_COUNT 14 - -#define IOCTL_ND_PROVIDER(i_) IOCTL_ND(ND_PROVIDER, i_) -#define IOCTL_ND_ADAPTER(i_) IOCTL_ND(ND_ADAPTER, i_) -#define IOCTL_ND_PD(i_) IOCTL_ND(ND_PD, i_) -#define IOCTL_ND_CQ(i_) IOCTL_ND(ND_CQ, i_) -#define IOCTL_ND_MR(i_) IOCTL_ND(ND_MR, i_) -#define IOCTL_ND_MW(i_) IOCTL_ND(ND_MW, i_) -#define IOCTL_ND_SRQ(i_) IOCTL_ND(ND_SRQ, i_) -#define IOCTL_ND_CONNECTOR(i_) IOCTL_ND(ND_CONNECTOR, i_) -#define IOCTL_ND_LISTENER(i_) IOCTL_ND(ND_LISTENER, i_) -#define IOCTL_ND_QP(i_) IOCTL_ND(ND_QP, i_) -#define IOCTL_ND_VIRTUAL_PARTITION(i_) IOCTL_ND(ND_VIRTUAL_PARTITION, i_) - -/* Provider IOCTLs */ -#define IOCTL_ND_PROVIDER_INIT IOCTL_ND_PROVIDER( 0 ) -#define IOCTL_ND_PROVIDER_BIND_FILE IOCTL_ND_PROVIDER( 1 ) -#define IOCTL_ND_PROVIDER_QUERY_ADDRESS_LIST IOCTL_ND_PROVIDER( 2 ) -#define IOCTL_ND_PROVIDER_RESOLVE_ADDRESS IOCTL_ND_PROVIDER( 3 ) -#define IOCTL_ND_PROVIDER_MAX_OPERATION 4 - -/* Adapter IOCTLs */ -#define IOCTL_ND_ADAPTER_OPEN IOCTL_ND_ADAPTER( 0 ) 
-#define IOCTL_ND_ADAPTER_CLOSE IOCTL_ND_ADAPTER( 1 ) -#define IOCTL_ND_ADAPTER_QUERY IOCTL_ND_ADAPTER( 2 ) -#define IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST IOCTL_ND_ADAPTER( 3 ) -#define IOCTL_ND_ADAPTER_MAX_OPERATION 4 - -/* Protection Domain IOCTLs */ -#define IOCTL_ND_PD_CREATE IOCTL_ND_PD( 0 ) -#define IOCTL_ND_PD_FREE IOCTL_ND_PD( 1 ) -#define IOCTL_ND_PD_MAX_OPERATION 2 - -/* Completion Queue IOCTLs */ -#define IOCTL_ND_CQ_CREATE IOCTL_ND_CQ( 0 ) -#define IOCTL_ND_CQ_FREE IOCTL_ND_CQ( 1 ) -#define IOCTL_ND_CQ_CANCEL_IO IOCTL_ND_CQ( 2 ) -#define IOCTL_ND_CQ_GET_AFFINITY IOCTL_ND_CQ( 3 ) -#define IOCTL_ND_CQ_MODIFY IOCTL_ND_CQ( 4 ) -#define IOCTL_ND_CQ_NOTIFY IOCTL_ND_CQ( 5 ) -#define IOCTL_ND_CQ_MAX_OPERATION 6 - -/* Memory Region IOCTLs */ -#define IOCTL_ND_MR_CREATE IOCTL_ND_MR( 0 ) -#define IOCTL_ND_MR_FREE IOCTL_ND_MR( 1 ) -#define IOCTL_ND_MR_CANCEL_IO IOCTL_ND_MR( 2 ) -#define IOCTL_ND_MR_REGISTER IOCTL_ND_MR( 3 ) -#define IOCTL_ND_MR_DEREGISTER IOCTL_ND_MR( 4 ) -#define IOCTL_NDK_MR_REGISTER IOCTL_ND_MR( 5 ) -#define IOCTL_ND_MR_MAX_OPERATION 6 - -/* Memory Window IOCTLs */ -#define IOCTL_ND_MW_CREATE IOCTL_ND_MW( 0 ) -#define IOCTL_ND_MW_FREE IOCTL_ND_MW( 1 ) -#define IOCTL_ND_MW_MAX_OPERATION 2 - -/* Shared Receive Queue IOCTLs */ -#define IOCTL_ND_SRQ_CREATE IOCTL_ND_SRQ( 0 ) -#define IOCTL_ND_SRQ_FREE IOCTL_ND_SRQ( 1 ) -#define IOCTL_ND_SRQ_CANCEL_IO IOCTL_ND_SRQ( 2 ) -#define IOCTL_ND_SRQ_GET_AFFINITY IOCTL_ND_SRQ( 3 ) -#define IOCTL_ND_SRQ_MODIFY IOCTL_ND_SRQ( 4 ) -#define IOCTL_ND_SRQ_NOTIFY IOCTL_ND_SRQ( 5 ) -#define IOCTL_ND_SRQ_MAX_OPERATION 6 - -/* Connector IOCTLs */ -#define IOCTL_ND_CONNECTOR_CREATE IOCTL_ND_CONNECTOR( 0 ) -#define IOCTL_ND_CONNECTOR_FREE IOCTL_ND_CONNECTOR( 1 ) -#define IOCTL_ND_CONNECTOR_CANCEL_IO IOCTL_ND_CONNECTOR( 2 ) -#define IOCTL_ND_CONNECTOR_BIND IOCTL_ND_CONNECTOR( 3 ) -#define IOCTL_ND_CONNECTOR_CONNECT IOCTL_ND_CONNECTOR( 4 ) -#define IOCTL_ND_CONNECTOR_COMPLETE_CONNECT IOCTL_ND_CONNECTOR( 5 ) -#define IOCTL_ND_CONNECTOR_ACCEPT IOCTL_ND_CONNECTOR( 6 ) -#define IOCTL_ND_CONNECTOR_REJECT IOCTL_ND_CONNECTOR( 7 ) -#define IOCTL_ND_CONNECTOR_GET_READ_LIMITS IOCTL_ND_CONNECTOR( 8 ) -#define IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA IOCTL_ND_CONNECTOR( 9 ) -#define IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS IOCTL_ND_CONNECTOR( 10 ) -#define IOCTL_ND_CONNECTOR_GET_ADDRESS IOCTL_ND_CONNECTOR( 11 ) -#define IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT IOCTL_ND_CONNECTOR( 12 ) -#define IOCTL_ND_CONNECTOR_DISCONNECT IOCTL_ND_CONNECTOR( 13 ) -#define IOCTL_ND_CONNECTOR_MAX_OPERATION 14 - -/* Listener IOCTLs */ -#define IOCTL_ND_LISTENER_CREATE IOCTL_ND_LISTENER( 0 ) -#define IOCTL_ND_LISTENER_FREE IOCTL_ND_LISTENER( 1 ) -#define IOCTL_ND_LISTENER_CANCEL_IO IOCTL_ND_LISTENER( 2 ) -#define IOCTL_ND_LISTENER_BIND IOCTL_ND_LISTENER( 3 ) -#define IOCTL_ND_LISTENER_LISTEN IOCTL_ND_LISTENER( 4 ) -#define IOCTL_ND_LISTENER_GET_ADDRESS IOCTL_ND_LISTENER( 5 ) -#define IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST IOCTL_ND_LISTENER( 6 ) -#define IOCTL_ND_LISTENER_MAX_OPERATION 7 - -/* Queue Pair IOCTLs */ -#define IOCTL_ND_QP_CREATE IOCTL_ND_QP( 0 ) -#define IOCTL_ND_QP_CREATE_WITH_SRQ IOCTL_ND_QP( 1 ) -#define IOCTL_ND_QP_FREE IOCTL_ND_QP( 2 ) -#define IOCTL_ND_QP_FLUSH IOCTL_ND_QP( 3 ) -#define IOCTL_ND_QP_MAX_OPERATION 4 - -/* Kernel-mode only IOCTLs (IRP_MJ_INTERNAL_DEVICE_CONTROL) */ -#define IOCTL_NDV_PARTITION_RESOLVE_ADAPTER_ID IOCTL_ND_VIRTUAL_PARTITION( 0 ) -#define IOCTL_NDV_PARTITION_CREATE IOCTL_ND_VIRTUAL_PARTITION( 1 ) -#define IOCTL_NDV_PARTITION_FREE 
IOCTL_ND_VIRTUAL_PARTITION( 2 )
-#define IOCTL_NDV_PARTITION_BIND IOCTL_ND_VIRTUAL_PARTITION( 3 )
-#define IOCTL_NDV_PARTITION_UNBIND IOCTL_ND_VIRTUAL_PARTITION( 4 )
-#define IOCTL_NDV_PARTITION_BIND_LUID IOCTL_ND_VIRTUAL_PARTITION( 5 )
-#define IOCTL_NDV_PARTITION_MAX_OPERATION 6
-
-
-#define MB_SHIFT 20
-
-
-/* Ringbuffer size for the channel */
-#define NDV_NUM_PAGES_IN_RING_BUFFER 64
-
-#define NDV_MAX_PACKETS_PER_RECEIVE 8
-
-#define NDV_MAX_PACKET_COUNT 16304
-
-#define NDV_MAX_NUM_OUTSTANDING_RECEIVED_PACKETS (16304)
-#define NDV_MAX_HANDLE_TABLE_SIZE (16304)
-#define NDV_HOST_MAX_HANDLE_TABLE_SIZE (NDV_MAX_HANDLE_TABLE_SIZE * 16)
-
-
-#define NDV_MAX_MAPPINGS 4
-
-#define NDV_STATE_NONE 0x00000000
-#define NDV_STATE_CREATED 0x00000001
-#define NDV_STATE_CONNECTING 0x00000002
-#define NDV_STATE_INITIALIZING 0x00000003
-#define NDV_STATE_OPERATIONAL 0xEFFFFFFF
-#define NDV_STATE_FAILED 0xFFFFFFFF
-
-
-#define NDV_MAX_PRIVATE_DATA_SIZE 64
-#define NDV_MAX_IOCTL_SIZE 256
-
-/* max size of buffer for vector of ND_MAPPING */
-#define NDV_MAX_MAPPING_BUFFER_SIZE \
- (NDV_MAX_MAPPINGS * sizeof(union nd_mapping))
-
-/* max expected ioctl buffer size from users */
-#define NDV_MAX_IOCTL_BUFFER_SIZE \
- (NDV_MAX_IOCTL_SIZE + \
- NDV_MAX_MAPPING_BUFFER_SIZE + \
- NDV_MAX_PRIVATE_DATA_SIZE)
-
-/* max PFN array for inline buffers */
-#define NDV_MAX_INLINE_PFN_ARRAY_LENGTH 32
-
-/* Field header size for inline buffer */
-#define NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE \
- (NDV_MAX_MAPPINGS * sizeof(NDV_PACKET_FIELD))
-
-/* Max for a single field */
-
-#define NDV_MAX_SINGLE_MAPPING_FIELD ( sizeof(GPA_RANGE) + \
- (sizeof(PFN_NUMBER) * NDV_MAX_INLINE_PFN_ARRAY_LENGTH))
-
-/* Max for all inline data */
-
-#define NDV_MAX_MAPPING_DATA_SIZE (NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE + \
- (NDV_MAX_MAPPINGS * NDV_MAX_SINGLE_MAPPING_FIELD))
-
-
-#define NDV_MAX_PACKET_HEADER_SIZE 256
-
-#define NDV_MAX_PACKET_SIZE (NDV_MAX_PACKET_HEADER_SIZE + \
- NDV_MAX_IOCTL_BUFFER_SIZE + \
- NDV_MAX_MAPPING_DATA_SIZE)
-
-/* Well known message type INIT is defined for the channel,
- * not for the protocol.
- */
-
-#define NDV_PACKET_TYPE_INIT 0xFFFFFFFF
-
-/* Invalid protocol version to identify uninitialized channels */
-
-#define NDV_PROTOCOL_VERSION_INVALID 0xFFFFFFFF
-
-/* Flags that control the behavior of packet handling */
-
-enum ndv_packet_options {
- NDV_PACKET_OPTION_NONE = 0x00,
-
- /* Indicates that the ExternalDataMdl parameter is expected to be
- * passed and must be handled in the receiver. This call must be
- * handled specially to ensure that the MDL can be created correctly.
- */
- NDV_PACKET_OPTION_EXTERNAL_DATA = 0x01,
-
- /* Indicates that the receiver must execute the handler at passive. */
- NDV_PACKET_OPTIONS_REQUIRES_PASSIVE = 0x02,
-
- /* Indicates that the sender does not expect and is not waiting for a
- * response packet.
- */
- NDV_PACKET_OPTIONS_POST = 0x04,
-};
-
-#define NDV_PACKET_TYPE(id_, opt_) \
- (((opt_)<<24) | (id_))
-
-#define NDV_PACKET_TYPE_OPTIONS(type_) \
- (((type_) >> 24) & 0xFF)
-
-#define NDV_PACKET_TYPE_ID(type_) \
- ((type_) & 0xFFFFFF) \
-
-#define NDV_ADD_PACKET_OPTION(type_, opt_) \
- (type_) |= (opt_<<24)
-
-/* The header value sent on all packets */
-union ndv_packet_hdr {
-
- struct {
- /* The type of packet.
- * This value should be created with the NDV_PACKET_TYPE macro
- * to include all packet options within the packet type.
- */
- u32 packet_type;
- /* The size of the entire fixed message structure that exists
- * before the data.
This must be >= sizeof(NDV_PACKET_HEADER) - */ - u32 hdr_sz; - /* This size of the data that follows the message - * data_sz + hdr_sz size gives the total size of - * the buffer that is used. - */ - u32 data_sz; - /* The status code used to indicate success or failure. - * It is only used in completions and during responses. - */ - u32 status; //KYS: NTSTATUS? - }; - - u64 padding[2]; //KYS: why? -}; - - -/* The core INIT packet. This message is defined in the channel - * not in the protocol. This message should never change size - * or behavior, as it could impact compatibility in the future. - * This packet is used to negotiate the protocol version, so chaning - * this size could break backward compat. - */ - -union ndv_packet_init { - struct { - u32 packet_type; - u32 protocol_version; - u32 flags; - }; - u64 padding[2]; -} __packed; - -#define NDV_PACKET_INIT_SIZE 16 - -/* Data packing flags used for accessing the dynamic fields inside a packet */ -#define NDV_DATA_PACKING_2 0x1 -#define NDV_DATA_PACKING_4 0x3 -#define NDV_DATA_PACKING_8 0x7 - - -#define NDV_PROTOCOL_VERSION_1 0x0100 -#define NDV_PROTOCOL_VERSION_CURRENT NDV_PROTOCOL_VERSION_1 -#define NDV_PROTOCOL_VERSION_COUNT 1 - -struct ndv_pkt_field { - u32 size; - u32 offset; -}; - -enum ndv_pkt_id { - NDV_PKT_UNKNOWN = 0, - /* Version 1 Message ID's */ - NDV_PKT_ID1_BIND, - NDV_PKT_ID1_CREATE, - NDV_PKT_ID1_CLEANUP, - NDV_PKT_ID1_CANCEL, - NDV_PKT_ID1_CONTROL, - NDV_PKT_ID1_COMPLETE, - NDV_PKT_ID1_INIT_RESOURCES, -}; - -/* The guest will send this as the first messages just after init - * The resources are reserved per channel. - */ - -struct ndv_pkt_hdr_init_resources_1 { - - union ndv_packet_hdr pkt_hdr; - u16 io_space_sz_mb; - u64 io_space_start; - -}; - - - -/* The guest will send this packet to the host after channel init - * to query support for the adapters that are registered. - */ - -struct ndv_pkt_hdr_bind_1 { - union ndv_packet_hdr pkt_hdr; - bool unbind; - union nd_sockaddr_inet ip_address; - struct if_physical_addr phys_addr; - u64 guest_id; -}; - -union ndv_context_handle { - u64 val64; - struct { - u32 local; - u32 remote; - }; -}; - -struct ndv_pkt_hdr_create_1 { - union ndv_packet_hdr pkt_hdr; - - /* Identifies the object used to track this file handle on both - * the guest and the host. When sent from the guest, it will contain - * the guest handle. On success, the host will populate and return - * it's handle value as well. - */ - - union ndv_context_handle handle; - - /* The parameters sent to the CreateFile call */ - u32 access_mask; - u32 open_options; - - u16 file_attributes; //KYS: This field must be 64 bit aligned - - u16 share_access; //KYS - - u32 kys_padding; //KYS - - u16 ea_length; //KYS; needs to be 64 bit aligned; what is ea length - unused -}; - - -struct ndv_pkt_hdr_cleanup_1 { - union ndv_packet_hdr pkt_hdr; - - /* Identifies the object used to track this file handle on both - * the guest and the host. When sent from the guest, it will contain - * the both the guest and host handle values. The host will use this - * value to cleanup its resource, then update its portion of the handle - * to NDV_HANDLE_NULL before returning the data back to the guest. - */ - union ndv_context_handle handle; -}; - -struct ndv_pkt_hdr_cancel_1 { - union ndv_packet_hdr pkt_hdr; - union ndv_context_handle file_handle; - union ndv_context_handle irp_handle; -}; - -struct ndv_bind_port_info { - //LUID authentication_id; //KYS: LUID? 
- bool is_admin; -}; - -struct ndv_extended_data_flds { - union { - u32 field_count; - u64 padding; - }; - - //struct ndv_pkt_field fields[ANYSIZE_ARRAY]; //KYS? -}; - - -struct ndv_packet_hdr_control_1 { - union ndv_packet_hdr pkt_hdr; - /* Identifies the object used to track this file handle on both - * the guest and the host. This should always have both guest - * and host handle values inside it. - */ - - union ndv_context_handle file_handle; - - /* The handle information for the allocated irp context object. - * This information is used when the host/guest starts the cancelation - */ - union ndv_context_handle irp_handle; - - /* The input data describing in the IO control parameters */ - - u32 io_cntrl_code; - u32 output_buf_sz; - u32 input_buf_sz; - u32 input_output_buf_offset; - - /* These are used in the return message to indicate the status of the IO - * operation and the amount of data written to the output buffer. - */ - u32 io_status; //KYS: NTSTATUS? - u32 bytes_returned; - - /* This contains the field information for additional data that is sent - * with the packet that is IOCTL specific. - */ - - struct ndv_pkt_field extended_data; -}; - -/* - * Include MLX specific defines. - */ - -#include "mx_abi.h" - -/* Driver specific state. - */ - -/* - * We need to have host open a file; some - * Windows constants for open. - */ -#define STANDARD_RIGHTS_ALL (0x001F0000L) -#define FILE_ATTRIBUTE_NORMAL (0x80) -#define FILE_SHARE_READ (0x00000001) -#define FILE_SHARE_WRITE (0x00000002) -#define FILE_SHARE_DELETE (0x00000004) -#define FILE_FLAG_OVERLAPPED (0x40000000) -#define FILE_SHARE_ALL (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE) -#define CREATE_ALWAYS (2) -#define OPEN_EXISTING (3) - -#define RTL_NUMBER_OF(_x) \ - sizeof(_x)/sizeof(_x[0]) -/* - * The context structure tracks the open state. - */ - -/* - * Packet layout for open adaptor. - */ - -/* - * Packet for querying the address list. - */ - -union query_addr_list_ioctl { - struct nd_handle in; - union nd_sockaddr_inet out[16]; //KYS a max of 16 addresses -}; - -struct pkt_query_addr_list { - struct ndv_packet_hdr_control_1 hdr; - union query_addr_list_ioctl ioctl; - unsigned long activity_id; -}; - - -struct pkt_fld { - u32 size; - u32 offset; -}; - -struct fld_data { - union { - u64 padding; - }; -}; - -struct extended_data_oad { - union { - u32 cnt; - u64 padding; - }; - /* offsets are from start of extended data struct - * and should start on 8 byte boundary - */ - struct pkt_fld fields[IBV_GET_CONTEXT_MAPPING_MAX]; -}; - -union oad_ioctl { - struct nd_open_adapter input; - struct nd_resource_descriptor resrc_desc; -}; - -union oad_mappings { - struct ibv_get_context_req ctx_input; - struct ibv_get_context_resp ctx_output; -}; - -struct pkt_nd_open_adapter { - struct ndv_packet_hdr_control_1 hdr; - - union oad_ioctl ioctl; - union oad_mappings mappings; - - /* - * Extended data. - */ - struct extended_data_oad ext_data; -}; - -/* - * Create CQ IOCTL. 
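hvnd_init_hdr(), called throughout the provider code earlier in this patch, is not shown in this hunk; judging from its call sites and from struct ndv_packet_hdr_control_1 above, it plausibly amounts to the following (a sketch, not the driver's definition):

    /* Sketch: callers still fill input/output_buf_sz separately. */
    static void init_ctrl_hdr(struct ndv_packet_hdr_control_1 *hdr,
                              u32 data_sz, u32 local, u32 remote,
                              u32 io_cntrl_code, u32 ext_size,
                              u32 ext_offset, u64 irp)
    {
        hdr->pkt_hdr.packet_type =
            NDV_PACKET_TYPE(NDV_PKT_ID1_CONTROL,
                            NDV_PACKET_OPTIONS_REQUIRES_PASSIVE);
        hdr->pkt_hdr.hdr_sz = sizeof(*hdr);
        hdr->pkt_hdr.data_sz = data_sz;

        hdr->file_handle.local = local;
        hdr->file_handle.remote = remote;
        hdr->irp_handle.val64 = irp;

        hdr->io_cntrl_code = io_cntrl_code;
        hdr->extended_data.size = ext_size;
        hdr->extended_data.offset = ext_offset;
    }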
- */ - -struct cq_db_gpa { - u32 byte_count; - u32 byte_offset; - u64 pfn_array[2]; -}; - -struct cq_sn_gpa { - u32 byte_count; - u32 byte_offset; - u64 pfn_array[2]; -}; - -struct create_cq_ext_data { - union { - u32 cnt; - u64 padding; - }; - /* offsets are from start of extended data struct - * and should start on 8 byte boundary - */ - struct pkt_fld fields[MLX4_IB_CREATE_CQ_MAPPING_MAX]; - struct cq_db_gpa db_gpa; - struct cq_sn_gpa sn_gpa; - struct gpa_range cqbuf_gpa; -}; - -union create_cq_ioctl { - struct nd_create_cq input; - struct nd_resource_descriptor resrc_desc; -}; - -union create_cq_mappings { - struct ibv_create_cq cq_in; - struct ibv_create_cq_resp cq_resp; -}; - -struct pkt_nd_create_cq { - struct ndv_packet_hdr_control_1 hdr; - - union create_cq_ioctl ioctl; - union create_cq_mappings mappings; - - /* - * Extended data. - */ - struct create_cq_ext_data ext_data; -}; - -/* - * IOCTL to free CQ. - */ -struct free_cq_ioctl { - struct nd_handle in; -}; - -struct pkt_nd_free_cq { - struct ndv_packet_hdr_control_1 hdr; - - struct free_cq_ioctl ioctl; -}; - - -/* - * IOCTL to QUERY CQ - CQ NOTIFY - */ - -struct notify_cq_ioctl { - struct nd_cq_notify in; -}; - -struct pkt_nd_notify_cq { - struct ndv_packet_hdr_control_1 hdr; - struct notify_cq_ioctl ioctl; -}; - -/* - * IOCTL to Create a listner - */ - -struct nd_ep_create { - struct nd_handle hdr; - bool to_semantics; - unsigned long activity_id; -}; - -union listener_cr_ioctl { - struct nd_ep_create in; - u64 out; -}; - -struct pkt_nd_cr_listener { - struct ndv_packet_hdr_control_1 hdr; - union listener_cr_ioctl ioctl; -}; - -/* - * IOCTL to free listener. - */ - -struct listener_free_ioctl { - struct nd_handle in; -}; - -struct pkt_nd_free_listener { - struct ndv_packet_hdr_control_1 hdr; - struct listener_free_ioctl ioctl; -}; - -/* - * IOCTL for listener cancel IO. - */ -struct listener_cancelio_ioctl { - struct nd_handle in; -}; - -struct pkt_nd_cancelio_listener { - struct ndv_packet_hdr_control_1 hdr; - struct listener_cancelio_ioctl ioctl; -}; - -/* - * IOCTL for LISTENER BIND - */ - -union listener_bind_ioctl { - struct ndk_bind in; -}; - -struct pkt_nd_bind_listener { - struct ndv_packet_hdr_control_1 hdr; - union listener_bind_ioctl ioctl; -}; - -/* - * After the listener is bound, enable - * listening. - */ - -union listener_listen_ioctl { - struct nd_listen in; -}; - -struct pkt_nd_listen_listener { - struct ndv_packet_hdr_control_1 hdr; - union listener_listen_ioctl ioctl; -}; - -/* - * IOCTL for getting the adddress from listener. - * - */ - -union listener_get_addr_ioctl { - struct nd_handle in; - union nd_sockaddr_inet out; -}; - -struct pkt_nd_get_addr_listener { - struct ndv_packet_hdr_control_1 hdr; - union listener_get_addr_ioctl ioctl; -}; - -/* - * IOCTL to get a connection from a listener. - */ - -union listener_get_connection_ioctl { - struct nd_get_connection_request in; - union nd_sockaddr_inet out; -}; - -struct pkt_nd_get_connection_listener { - struct ndv_packet_hdr_control_1 hdr; - union listener_get_connection_ioctl ioctl; -}; - - -/* - * Connector IOCTLs - */ - -/* - * IOCTL to create connector. - */ - -union connector_cr_ioctl { //KYS should this be a union or struct? - struct nd_ep_create in; - u64 out; -}; - -struct pkt_nd_cr_connector { - struct ndv_packet_hdr_control_1 hdr; - union connector_cr_ioctl ioctl; //KYS: union or struct -}; - -/* - * IOCTL to free connector. 
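A convention that runs through all of these packet definitions: the ioctl member is a union of request and reply because the host overwrites the buffer in place, which is why the .c file reads pkt.ioctl.out immediately after hvnd_send_ioctl_pkt() returns. An illustrative caller for the listener get-address packet above, in the same shape as the connector helpers earlier in the patch:

    /* Illustrative: request sent via .in, reply read back via .out. */
    static int get_listener_addr(struct hvnd_dev *nd_dev,
                                 struct hvnd_ucontext *uctx,
                                 u64 listener_handle,
                                 union nd_sockaddr_inet *addr)
    {
        struct pkt_nd_get_addr_listener pkt;
        int ret;

        memset(&pkt, 0, sizeof(pkt));
        hvnd_init_hdr(&pkt.hdr,
                      sizeof(pkt) - sizeof(struct ndv_packet_hdr_control_1),
                      uctx->create_pkt.handle.local,
                      uctx->create_pkt.handle.remote,
                      IOCTL_ND_LISTENER_GET_ADDRESS, 0, 0, 0);

        pkt.ioctl.in.version = ND_VERSION_1;
        pkt.ioctl.in.handle = listener_handle;

        ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt);
        if (ret)
            return ret;

        *addr = pkt.ioctl.out;    /* host wrote the reply over the union */
        return 0;
    }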
- */ - -struct connector_free_ioctl { - struct nd_handle in; -}; - -struct pkt_nd_free_connector { - struct ndv_packet_hdr_control_1 hdr; - struct connector_free_ioctl ioctl; -}; - -/* - * IOCTL to cancel I/O on a connector. - */ - -struct connector_cancelio_ioctl { - struct nd_handle in; -}; - -struct pkt_nd_cancelio_connector { - struct ndv_packet_hdr_control_1 hdr; - struct connector_cancelio_ioctl ioctl; -}; - -/* - * IOCTL to Bind an address to the connector. - */ - -union connector_bind_ioctl { - struct ndk_bind in; -}; - -struct pkt_nd_bind_connector { - struct ndv_packet_hdr_control_1 hdr; - union connector_bind_ioctl ioctl; -}; - -/* - * IOCTL to connect a connector. - */ - -struct connector_connect_in { - struct nd_connect hdr; - u8 retry_cnt; - u8 rnr_retry_cnt; - u8 priv_data[56]; - unsigned long activity_id; -}; - -union connector_connect_ioctl { - struct connector_connect_in in; -}; - -struct pkt_nd_connector_connect { - struct ndv_packet_hdr_control_1 hdr; - union connector_connect_ioctl ioctl; -}; - -/* - * IOCTL for connector complete connect - */ - -struct complete_connect_in { - struct nd_handle hdr; - u8 rnr_nak_to; - unsigned long activity_id; -}; - -struct complete_connect_out { - enum ibv_qp_state state; -}; - -union connector_complete_connect_ioctl { - struct complete_connect_in in; - struct complete_connect_out out; -}; - -struct pkt_nd_connector_connect_complete { - struct ndv_packet_hdr_control_1 hdr; - union connector_complete_connect_ioctl ioctl; -}; - - -#define MAX_PRIVATE_DATA_LEN 148 - -/* - * IOCTL for connector accept. - */ - -struct connector_accept_in { - struct nd_accept hdr; - u8 rnr_retry_cnt; - u8 rnr_nak_to; - u8 private_data[MAX_PRIVATE_DATA_LEN]; - unsigned long activity_id; -}; - -struct connector_accept_out { - enum ibv_qp_state state; -}; - -union connector_accept_ioctl { - struct connector_accept_in in; - struct connector_accept_out out; -}; - -struct pkt_nd_connector_accept { - struct ndv_packet_hdr_control_1 hdr; - union connector_accept_ioctl ioctl; -}; - -/* - * IOCTL for connector to reject a connection. - */ - -struct connector_reject_in { - struct nd_reject hdr; - u8 private_data[MAX_PRIVATE_DATA_LEN]; -}; - -struct connector_reject_out { - enum ibv_qp_state state; -}; - -union connector_reject_ioctl { - struct connector_reject_in in; - struct connector_reject_out out; -}; - -struct pkt_nd_connector_reject { - struct ndv_packet_hdr_control_1 hdr; - union connector_reject_ioctl ioctl; -}; - -/* - * IOCTL to get connector read limits. - */ - -struct connector_get_rd_limits_in { - struct nd_handle in; -}; - -struct connector_get_rd_limits_out { - struct nd_read_limits out; -}; - -union connector_get_rd_limits_ioctl { - struct connector_get_rd_limits_in in; - struct connector_get_rd_limits_out out; -}; - -struct pkt_nd_connector_get_rd_limits { - struct ndv_packet_hdr_control_1 hdr; - union connector_get_rd_limits_ioctl ioctl; -}; - -/* - * IOCTL to get connector private data. - */ -union connector_get_priv_data_ioctl { - struct nd_handle in; - u8 out[MAX_PRIVATE_DATA_LEN]; -}; - -struct pkt_nd_connector_get_priv_data { - struct ndv_packet_hdr_control_1 hdr; - union connector_get_priv_data_ioctl ioctl; -}; - - -/* - * IOCTL get peer address. - */ - -union connector_get_peer_addr_ioctl { - struct nd_handle in; - union nd_sockaddr_inet out; -}; - -struct pkt_nd_connector_get_peer_addr { - struct ndv_packet_hdr_control_1 hdr; - union connector_get_peer_addr_ioctl ioctl; -}; - -/* - * IOCTL to get connector address. 
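MAX_PRIVATE_DATA_LEN (148) caps the payload embedded in the accept/reject packets above, so callers must clamp or refuse oversized private data before the memcpy into the fixed array. A defensive sketch (the helper name is illustrative):

    /* Sketch: validate caller-supplied private data against the cap. */
    static int copy_priv_data(u8 *dst, const void *src, size_t len)
    {
        if (len > MAX_PRIVATE_DATA_LEN)
            return -EINVAL;     /* refuse rather than silently truncate */
        if (len)
            memcpy(dst, src, len);
        return 0;
    }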
-/*
- * IOCTL to get connector address.
- */
-
-union connector_get_addr_ioctl {
-    struct nd_handle in;
-    union nd_sockaddr_inet out;
-};
-
-struct pkt_nd_connector_get_addr {
-    struct ndv_packet_hdr_control_1 hdr;
-    union connector_get_addr_ioctl ioctl;
-};
-
-/*
- * IOCTL for disconnect notification.
- */
-
-union connector_notify_disconnect_ioctl {
-    struct nd_handle in;
-};
-
-struct pkt_nd_connector_notify_disconnect {
-    struct ndv_packet_hdr_control_1 hdr;
-    union connector_notify_disconnect_ioctl ioctl;
-};
-
-union connector_disconnect_ioctl {
-    struct nd_handle in;
-};
-
-struct pkt_nd_connector_disconnect {
-    struct ndv_packet_hdr_control_1 hdr;
-    union connector_notify_disconnect_ioctl ioctl;
-};
-
-/*
- * IOCTLs for QP operations.
- */
-
-/*
- * Create qp IOCTL.
- */
-
-struct qp_db_gpa {
-    u32 byte_count;
-    u32 byte_offset;
-    u64 pfn_array[1];
-};
-
-struct create_qp_ext_data {
-    union {
-        u32 cnt;
-        u64 padding;
-    };
-    /* offsets are from start of extended data struct
-     * and should start on 8 byte boundary
-     */
-    struct pkt_fld fields[MLX4_IB_CREATE_QP_MAPPINGS_MAX];
-    struct qp_db_gpa db_gpa;
-    struct gpa_range qpbuf_gpa;
-};
-
-union create_qp_ioctl {
-    struct nd_create_qp input;
-    struct nd_resource_descriptor resrc_desc;
-};
-
-union create_qp_mappings {
-    struct ibv_create_qp qp_in;
-    struct ibv_create_qp_resp qp_resp;
-};
-
-struct pkt_nd_create_qp {
-    struct ndv_packet_hdr_control_1 hdr;
-
-    union create_qp_ioctl ioctl;
-    union create_qp_mappings mappings;
-
-    /*
-     * Extended data.
-     */
-    struct create_qp_ext_data ext_data;
-};
-
-/*
- * IOCTL to flush a QP.
- */
-struct flush_qp_ioctl {
-    struct nd_handle in;
-    enum ibv_qp_state out;
-};
-
-struct pkt_nd_flush_qp {
-    struct ndv_packet_hdr_control_1 hdr;
-    struct flush_qp_ioctl ioctl;
-};
-
-/*
- * Memory Region IOCTLS
- */
-union create_mr_ioctl {
-    struct nd_handle in;
-    u64 out;
-};
-
-struct pkt_nd_create_mr {
-    struct ndv_packet_hdr_control_1 hdr;
-    union create_mr_ioctl ioctl;
-};
-
-struct mr_out {
-    u32 lkey;
-    u32 rkey;
-    unsigned long activity_id;
-};
-
-
-union register_mr_ioctl {
-    struct nd_mr_register in;
-    struct mr_out out;
-};
-
-struct pkt_nd_register_mr {
-    struct ndv_packet_hdr_control_1 hdr;
-    union register_mr_ioctl ioctl;
-};
-
-struct deregister_mr_ioctl {
-    struct nd_handle in;
-};
-
-struct pkt_nd_deregister_mr {
-    struct ndv_packet_hdr_control_1 hdr;
-    struct deregister_mr_ioctl ioctl;
-};
-
-/*
- * IOCTL to disconnect connector
- */
-
-/*
- * Create PD IOCTL.
- */
-struct nd_create_pd_ioctl {
-    union {
-        struct nd_handle in;
-        u64 out_handle;
-    };
-    struct ibv_alloc_pd_resp resp;
-};
-
-struct pkt_nd_pd_create {
-    struct ndv_packet_hdr_control_1 hdr;
-    struct nd_create_pd_ioctl ioctl;
-};
-
-/*
- * Free Handle. Check the layout with Luke.
- *
- */
-struct free_handle_ioctl {
-    struct nd_handle in;
-};
-
-struct pkt_nd_free_handle {
-    struct ndv_packet_hdr_control_1 hdr;
-    struct free_handle_ioctl ioctl;
-};
-
-/*
- * Cancel I/O.
- */
-
-struct cancel_io_ioctl {
-    struct nd_handle in;
-};
-
-struct pkt_nd_cancel_io {
-    struct ndv_packet_hdr_control_1 hdr;
-    struct cancel_io_ioctl ioctl;
-};
-
-/*
- * Connector states:
- */
-
-enum connector_state {
-    HVND_CON_INCOMING,
-    HVND_CON_INCOMING_ESTABLISHED,
-    HVND_CON_INCOMING_REJECTED,
-    HVND_CON_OUTGOING_REQUEST
-};
-
-
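Aside, not part of the patch: nearly all of these wrappers follow one convention, the .in and .out union members overlay the same wire buffer, so the request is written through .in and, once the host completes the packet, the reply is read back through .out. A hedged sketch using the MR registration union above:

static void example_read_mr_reply(struct pkt_nd_register_mr *pkt,
                                  u32 *lkey, u32 *rkey)
{
        /* Only valid after the host has completed this packet; the
         * response bytes overlay what was sent as struct nd_mr_register. */
        *lkey = pkt->ioctl.out.lkey;
        *rkey = pkt->ioctl.out.rkey;
}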
-/*
- * Adaptor query IOCTL.
- */
-struct nd_adap_query_ioctl {
-    union {
-        struct nd_adapter_query ad_q;
-        struct adapter_info_v2 ad_info;
-    };
-};
-
-struct pkt_nd_query_adaptor {
-    struct ndv_packet_hdr_control_1 hdr;
-    struct nd_adap_query_ioctl ioctl;
-};
-
-struct nd_ioctl {
-    union {
-        struct nd_handle handle;
-        u8 raw_buffer[NDV_MAX_IOCTL_BUFFER_SIZE];
-    };
-};
-
-struct pkt_nd_provider_ioctl {
-    struct ndv_packet_hdr_control_1 hdr;
-    struct nd_ioctl ioctl;
-};
-
-struct hvnd_ib_pd {
-    struct ib_pd ibpd;
-    u32 pdn;
-    u64 handle;
-};
-
-struct hvnd_work {
-    struct work_struct work;
-    void *callback_arg;
-};
-
-struct hvnd_disconnect_work {
-    struct work_struct work;
-    int status;
-    void *callback_arg;
-};
-
-/*
-struct hvnd_delayed_work {
-    struct delayed_work work;
-    void *callback_arg;
-};
-*/
-
-enum hvnd_cm_state {
-    hvnd_cm_idle = 0,
-    hvnd_cm_connect_reply_sent, //active
-    hvnd_cm_connect_reply_refused,
-    hvnd_cm_connect_received, //active
-    hvnd_cm_connect_request_sent, //passive
-    hvnd_cm_accept_sent,
-    hvnd_cm_close_sent,
-    hvnd_cm_established_sent,
-};
-
-struct incoming_pkt {
-    struct list_head list_entry;
-    char pkt[0];
-};
-
-struct hvnd_ep_obj {
-/*
-    spinlock_t ep_lk;
-    bool to_be_destroyed;
-    bool io_outstanding;
-
-    wait_queue_head_t wait;
-    bool stopped;
-    atomic_t process_refcnt; // how many NDV_PKT_ID1_COMPLETE packets we are currently processing
-*/
-    bool stopping;
-    wait_queue_head_t wait_pending;
-    atomic_t nr_requests_pending;
-
-    enum nd_resource_type type;
-    enum connector_state state; //KYS need to look at locking
-    struct iw_cm_id *cm_id;
-    enum hvnd_cm_state cm_state;
-    struct completion block_event;
-    struct completion disconnect_event;
-    struct completion connector_accept_event;
-    int connector_accept_status;
-    u64 ep_handle;
-    spinlock_t incoming_pkt_list_lock;
-    struct list_head incoming_pkt_list;
-    struct hvnd_ep_obj *parent;
-    struct hvnd_dev *nd_dev;
-    struct hvnd_ucontext *uctx;
-    struct hvnd_work wrk;
-    struct hvnd_cq *cq;
-    u8 ord;
-    u8 ird;
-    char priv_data[MAX_PRIVATE_DATA_LEN];
-    bool incoming;
-    atomic_t disconnect_notified;
-    u64 outstanding_handle;
-    u32 local_irp;
-    struct hvnd_ep_obj *outstanding_ep;
-    struct pkt_nd_connector_connect connector_connect_pkt;
-    int connector_connect_retry;
-};
-
-struct hvnd_ucontext {
-    struct ib_ucontext ibucontext;
-    struct list_head listentry;
-    struct ndv_pkt_hdr_create_1 create_pkt;
-    struct ndv_pkt_hdr_create_1 create_pkt_ovl; /* Overlap handle */
-    struct pkt_nd_provider_ioctl pr_init_pkt;
-    union ndv_context_handle file_handle;
-    union ndv_context_handle file_handle_ovl;
-
-    struct pkt_nd_open_adapter o_adap_pkt;
-
-    u64 adaptor_hdl;
-
-    /*
-     * Protection domain state.
-     */
-    struct pkt_nd_pd_create pd_cr_pkt;
-
-    u64 uar_base;
-    u64 bf_base;
-    u32 bf_buf_size;
-    u32 bf_offset;
-    u32 cqe_size;
-    u32 max_qp_wr;
-    u32 max_sge;
-    u32 max_cqe;
-    u32 num_qps;
-
-    /*
-     * State to manage doorbell pages:
-     */
-    struct list_head db_page_list;
-    struct mutex db_page_mutex;
-
-    atomic_t refcnt;
-
-};
-
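Aside, not part of the patch: the stopping/nr_requests_pending/wait_pending triplet in hvnd_ep_obj implements a drain-before-teardown guard (see ep_add_work_pending(), ep_del_work_pending() and ep_stop() in provider.c further down). A sketch of the intended usage on a completion path, illustrative only:

static void example_handle_host_completion(struct hvnd_ep_obj *ep)
{
        if (!ep_add_work_pending(ep))
                return; /* ep is stopping; drop the event */

        /* ... process the NDV completion packet here ... */

        ep_del_work_pending(ep); /* wakes ep_stop() once drained */
}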
-struct hvnd_dev {
-    struct ib_device ibdev;
-    struct hv_device *hvdev;
-    u32 device_cap_flags;
-    unsigned char nports;
-    bool ib_active;
-
-    /* State to manage interaction with the host.
-     */
-
-    spinlock_t uctxt_lk;
-    struct list_head listentry;
-
-    unsigned long mmio_sz;
-    unsigned long mmio_start_addr;
-    struct resource mmio_resource;
-    void *mmio_virt;
-
-    unsigned long negotiated_version;
-    union ndv_packet_init init_pkt;
-    struct ndv_pkt_hdr_init_resources_1 resources;
-    struct ndv_pkt_hdr_bind_1 bind_pkt;
-
-    struct ndv_pkt_hdr_create_1 global_create_pkt;
-    union ndv_context_handle global_file_handle;
-
-    struct semaphore query_pkt_sem;
-    bool query_pkt_set;
-    struct pkt_nd_query_adaptor query_pkt;
-
-    /*
-     * ID tables.
-     */
-    spinlock_t id_lock;
-
-    struct idr cqidr;
-    struct idr qpidr;
-    struct idr mmidr;
-    struct idr irpidr;
-    struct idr uctxidr;
-    atomic_t open_cnt;
-
-    char ip_addr[4];
-    char mac_addr[6];
-    struct completion addr_set;
-    int bind_complete;
-    struct mutex bind_mutex;
-};
-
-struct hvnd_cq {
-    struct ib_cq ibcq;
-    void *cq_buf;
-    void *db_addr;
-    u32 arm_sn;
-    u32 entries;
-
-    u32 cqn;
-    u32 cqe;
-    u64 cq_handle;
-
-    struct ib_umem *umem;
-    struct ib_umem *db_umem;
-    struct mlx4_ib_user_db_page user_db_page;
-    struct hvnd_ucontext *uctx;
-    struct hvnd_ep_obj ep_object; //KYS need to clean this up; have a cq irp state
-    bool monitor;
-    bool upcall_pending;
-};
-
-struct hvnd_qp {
-    struct ib_qp ibqp;
-    void *qp_buf;
-    void *db_addr;
-    u32 buf_size;
-    u8 port;
-    struct hvnd_dev *nd_dev;
-
-    __u8 log_sq_bb_count;
-    __u8 log_sq_stride;
-    __u8 sq_no_prefetch;
-
-    int rq_wqe_cnt;
-    int rq_wqe_shift;
-    int rq_max_gs;
-
-    int sq_wqe_cnt;
-    int sq_wqe_shift;
-    int sq_max_gs;
-
-    u32 max_inline_data;
-
-    u32 initiator_q_depth;
-    u32 initiator_request_sge;
-
-    u32 receive_q_depth;
-    u32 receive_request_sge;
-
-    struct hvnd_cq *recv_cq;
-    struct hvnd_cq *send_cq;
-
-    u64 receive_cq_handle;
-    u64 initiator_cq_handle;
-    u64 pd_handle;
-
-    u64 qp_handle;
-    u32 qpn;
-    u32 max_send_wr;
-    u32 max_recv_wr;
-    u32 max_send_sge;
-    u32 max_recv_sge;
-
-    struct ib_umem *umem;
-    struct ib_umem *db_umem;
-    struct mlx4_ib_user_db_page user_db_page;
-    struct hvnd_ucontext *uctx;
-    struct iw_cm_id *cm_id;
-
-    /*
-     * Current QP state; need to look at locking.
-     * XXXKYS
-     */
-    enum ib_qp_state qp_state;
-    bool cq_notify;
-    wait_queue_head_t wait;
-    atomic_t refcnt;
-    struct hvnd_ep_obj *connector;
-};
-
-struct hvnd_mr {
-    struct ib_mr ibmr;
-    struct hvnd_ib_pd *pd;
-    struct ib_umem *umem;
-    u64 start;
-    u64 length;
-    u64 virt;
-    int acc;
-    u64 mr_handle;
-    u32 mr_lkey;
-    u32 mr_rkey;
-};
-
-struct hvnd_cookie {
-    struct completion host_event;
-    void *pkt;
-};
-
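Aside, not part of the patch: struct hvnd_cookie pairs a completion with the request it tracks; presumably the VMBus request id carries the cookie's address so hvnd_callback() can wake the sender. A rough sketch of the send-and-wait pattern under that assumption; the real send path lives in vmbus_rdma.c, which is not shown in this hunk:

static int example_send_and_wait(struct hvnd_dev *nd_dev, void *pkt, u32 len)
{
        struct hvnd_cookie cookie;
        int ret;

        init_completion(&cookie.host_event);
        cookie.pkt = pkt;

        /* hypothetical: the request id is the cookie address */
        ret = vmbus_sendpacket(nd_dev->hvdev->channel, pkt, len,
                               (u64)&cookie, VM_PKT_DATA_INBAND,
                               VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
        if (ret)
                return ret;

        wait_for_completion(&cookie.host_event);
        return 0;
}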
-/*
- * Definitions to retrieve the IP address.
- */
-
-#define HVND_CURRENT_VERSION 0
-
-struct hvnd_ipaddr_tuple {
-    char mac_address[ETH_ALEN];
-    struct sockaddr addr;
-};
-
-struct hvnd_msg {
-    int status;
-    struct hvnd_ipaddr_tuple ip_tuple;
-};
-
-static inline struct hvnd_ib_pd *to_nd_pd(struct ib_pd *pd)
-{
-    return container_of(pd, struct hvnd_ib_pd, ibpd);
-}
-
-static inline struct hvnd_dev *to_nd_dev(struct ib_device *ibdev)
-{
-    return container_of(ibdev, struct hvnd_dev, ibdev);
-}
-
-static inline struct hvnd_cq *to_nd_cq(struct ib_cq *ibcq)
-{
-    return container_of(ibcq, struct hvnd_cq, ibcq);
-}
-
-static inline struct hvnd_qp *to_nd_qp(struct ib_qp *ibqp)
-{
-    return container_of(ibqp, struct hvnd_qp, ibqp);
-}
-
-static inline struct hvnd_ucontext *to_nd_context(struct ib_ucontext *ibucontext)
-{
-    return container_of(ibucontext, struct hvnd_ucontext, ibucontext);
-}
-
-static inline struct hvnd_ucontext *get_uctx_from_pd(struct ib_pd *pd)
-{
-    return to_nd_context(pd->uobject->context);
-}
-
-static inline struct hvnd_mr *to_nd_mr(struct ib_mr *ibmr)
-{
-    return container_of(ibmr, struct hvnd_mr, ibmr);
-}
-/*
- * ID management.
- */
-
-static inline int insert_handle(struct hvnd_dev *dev, struct idr *idr,
-                                void *handle, u32 id)
-{
-    int ret;
-    unsigned long flags;
-
-    idr_preload(GFP_KERNEL);
-    spin_lock_irqsave(&dev->id_lock, flags);
-
-    ret = idr_alloc(idr, handle, id, id + 1, GFP_ATOMIC);
-
-    spin_unlock_irqrestore(&dev->id_lock, flags);
-    idr_preload_end();
-
-    BUG_ON(ret == -ENOSPC);
-    return ret < 0 ? ret : 0;
-}
-
-static inline void remove_handle(struct hvnd_dev *dev, struct idr *idr, u32 id)
-{
-    unsigned long flags;
-
-    spin_lock_irqsave(&dev->id_lock, flags);
-    idr_remove(idr, id);
-    spin_unlock_irqrestore(&dev->id_lock, flags);
-}
-
-static inline struct hvnd_cq *get_cqp(struct hvnd_dev *dev, u32 cqid)
-{
-    struct hvnd_cq *cqp;
-    unsigned long flags;
-
-    spin_lock_irqsave(&dev->id_lock, flags);
-    cqp = idr_find(&dev->cqidr, cqid);
-    spin_unlock_irqrestore(&dev->id_lock, flags);
-
-    return cqp;
-}
-
-static inline struct hvnd_qp *get_qpp(struct hvnd_dev *dev, u32 qpid)
-{
-    struct hvnd_qp *qpp;
-    unsigned long flags;
-
-    spin_lock_irqsave(&dev->id_lock, flags);
-    qpp = idr_find(&dev->qpidr, qpid);
-    spin_unlock_irqrestore(&dev->id_lock, flags);
-
-    return qpp;
-}
-
-static inline struct hvnd_ucontext *get_uctx(struct hvnd_dev *dev, u32 pid)
-{
-    struct hvnd_ucontext *uctx;
-    unsigned long flags;
-
-    spin_lock_irqsave(&dev->id_lock, flags);
-    uctx = idr_find(&dev->uctxidr, pid);
-    spin_unlock_irqrestore(&dev->id_lock, flags);
-
-    return uctx;
-}
-
-
-static inline void *map_irp_to_ctx(struct hvnd_dev *nd_dev, u32 irp)
-{
-    void *ctx;
-    unsigned long flags;
-
-    spin_lock_irqsave(&nd_dev->id_lock, flags);
-    ctx = idr_find(&nd_dev->irpidr, irp);
-    spin_unlock_irqrestore(&nd_dev->id_lock, flags);
-
-    return ctx;
-}
-
-
-
-void hvnd_callback(void *context);
-int hvnd_negotiate_version(struct hvnd_dev *nd_dev);
-int hvnd_init_resources(struct hvnd_dev *nd_dev);
-int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, char *mac_addr);
-int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx);
-int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx);
-int hvnd_query_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx);
-int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev,
-                   struct hvnd_ib_pd *hvnd_pd);
-
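Aside, not part of the patch: the idr tables above map host-assigned numbers back to driver objects under id_lock, and insert_handle() stores at a caller-chosen fixed id (the [id, id + 1) window). An illustrative pairing, assuming the CQ number is used as the key, as get_cqp() suggests:

static int example_track_cq(struct hvnd_dev *dev, struct hvnd_cq *cq)
{
        int ret;

        ret = insert_handle(dev, &dev->cqidr, cq, cq->cqn);
        if (ret)
                return ret;

        /* later, e.g. on the completion path: */
        WARN_ON(get_cqp(dev, cq->cqn) != cq);

        remove_handle(dev, &dev->cqidr, cq->cqn);
        return 0;
}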
-/*
- * CQ operations.
- */
-int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                   struct hvnd_cq *cq);
-int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq);
-int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq,
-                   u32 notify_type, u64 irp_handle);
-
-/*
- * QP operations.
- */
-int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                   struct hvnd_qp *qp);
-
-int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                 struct hvnd_qp *qp);
-
-int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                  struct hvnd_qp *qp);
-
-/*
- * MR operations.
- */
-
-int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-               u64 pd_handle, u64 *mr_handle);
-
-int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                 u64 handle);
-
-int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                     struct hvnd_mr *mr);
-int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                       u64 handle);
-
-/*
- * Listener operations
- */
-int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, u64 *handle);
-
-
-int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                       u64 listener_handle);
-
-int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                       u64 listener_handle, union nd_sockaddr_inet *addr);
-
-int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                         u64 listener_handle, u32 backlog);
-
-int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                           u64 listener_handle, union nd_sockaddr_inet *addr);
-
-int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                                 u64 listener_handle, u64 connector_handle,
-                                 u64 irp_handle);
-
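Aside, not part of the patch: the MR prototypes above imply a two-step setup, create the host MR object, then register the pages against it. A sketch of the inferred ordering, with teardown in reverse on failure; illustrative only:

static int example_mr_lifecycle(struct hvnd_dev *nd_dev,
                                struct hvnd_ucontext *uctx,
                                struct hvnd_mr *mr, u64 pd_handle)
{
        int ret;

        ret = hvnd_cr_mr(nd_dev, uctx, pd_handle, &mr->mr_handle);
        if (ret)
                return ret;

        ret = hvnd_mr_register(nd_dev, uctx, mr);
        if (ret)
                hvnd_free_mr(nd_dev, uctx, mr->mr_handle);
        return ret;
}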
-/*
- * Connector operations.
- */
-int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                      u64 *connector_handle);
-
-int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                        u64 handle);
-
-int hvnd_cancelio_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                            u64 handle);
-int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                        u64 handle, union nd_sockaddr_inet *addr);
-
-int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                           u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit,
-                           u32 priv_data_length, const u8 *priv_data,
-                           u64 qp_handle, struct if_physical_addr *phys_addr,
-                           union nd_sockaddr_inet *dest_addr, struct hvnd_ep_obj *ep);
-
-int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                                    u64 connector_handle, enum ibv_qp_state *qp_state);
-
-int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                          u64 connector_handle,
-                          u64 qp_handle,
-                          u32 in_rd_limit, u32 out_rd_limit,
-                          u32 priv_data_length, const u8 *priv_data,
-                          enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep);
-
-int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                          u64 connector_handle,
-                          u32 priv_data_length, u8 *priv_data,
-                          enum ibv_qp_state *qp_state);
-
-int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev,
-                                 struct hvnd_ucontext *uctx,
-                                 u64 connector_handle,
-                                 struct nd_read_limits *rd_limits);
-
-int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev,
-                                 struct hvnd_ucontext *uctx,
-                                 u64 connector_handle,
-                                 u8 *priv_data);
-
-int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev,
-                                 struct hvnd_ucontext *uctx,
-                                 u64 connector_handle,
-                                 union nd_sockaddr_inet *peer_addr);
-
-int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev,
-                                  struct hvnd_ucontext *uctx,
-                                  u64 connector_handle,
-                                  union nd_sockaddr_inet *local_addr);
-
-int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev,
-                                     struct hvnd_ucontext *uctx,
-                                     u64 connector_handle, struct hvnd_ep_obj *ep);
-
-
-int hvnd_connector_disconnect(struct hvnd_dev *nd_dev,
-                              struct hvnd_ucontext *uctx,
-                              u64 connector_handle, struct hvnd_ep_obj *ep);
-
-int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                     u64 handle, u32 ioctl);
-
-int hvnd_cancel_io(struct hvnd_ep_obj *ep_object);
-
-char *hvnd_get_op_name(int ioctl);
-
-void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx);
-void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx);
-void hvnd_process_events(struct work_struct *work);
-
-void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep, int status);
-void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep, int status);
-void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status);
-void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status);
-void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status);
-
-void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp);
-int get_irp_handle(struct hvnd_dev *nd_dev, u32 *local, void *irp_ctx);
-
-void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr,
-                   u32 data_sz, u32 local, u32 remote,
-                   u32 ioctl_code,
-                   u32 ext_data_sz, u32 ext_data_offset,
-                   u64 irp_handle);
-
-int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev,
-                        struct ndv_packet_hdr_control_1 *hdr,
-                        u32 pkt_size, u64 cookie);
-
-int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-                                union nd_sockaddr_inet *og_addr);
-
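Aside, not part of the patch: the connector API above implies a fixed active-side sequence, create, bind to the local address, connect with the QP handle, then complete once the host signals the connector event. A hedged sketch of that ordering (the ORD/IRD values of 1/1 and the empty private data are placeholders, and error unwinding is simplified):

static int example_active_connect(struct hvnd_dev *nd_dev,
                                  struct hvnd_ucontext *uctx,
                                  struct hvnd_ep_obj *ep, struct hvnd_qp *qp,
                                  union nd_sockaddr_inet *laddr,
                                  union nd_sockaddr_inet *raddr,
                                  struct if_physical_addr *phys)
{
        enum ibv_qp_state qps;
        u64 conn = 0;
        int ret;

        ret = hvnd_cr_connector(nd_dev, uctx, &conn);
        if (ret)
                return ret;
        ret = hvnd_bind_connector(nd_dev, uctx, conn, laddr);
        if (ret)
                goto free;
        ret = hvnd_connector_connect(nd_dev, uctx, conn, 1, 1, 0, NULL,
                                     qp->qp_handle, phys, raddr, ep);
        if (ret)
                goto free;
        /* Completion normally arrives via the connector event; then: */
        return hvnd_connector_complete_connect(nd_dev, uctx, conn, &qps);
free:
        hvnd_free_connector(nd_dev, uctx, conn);
        return ret;
}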
-int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote,
-                            char *mac_addr);
-
-void hvnd_addr_init(void);
-
-void hvnd_addr_deinit(void);
-
-bool ep_add_work_pending(struct hvnd_ep_obj *ep_object);
-void ep_del_work_pending(struct hvnd_ep_obj *ep_object);
-void ep_stop(struct hvnd_ep_obj *ep_object);
-
-#define current_pid() (current->pid)
-/*
- * NT STATUS defines.
- */
-
-#define STATUS_SUCCESS 0x0
-#define STATUS_PENDING 0x00000103
-#define STATUS_CANCELLED 0xC0000120
-#define STATUS_DISCONNECTED 0xC000020C
-#define STATUS_TIMEOUT 0xC00000B5
-
-void inc_ioctl_counter_request(unsigned ioctl);
-void inc_ioctl_counter_response(unsigned ioctl);
-
-#define NDV_PROTOCOL_VAERSION_INVALID -1
-#define NDV_PACKET_INIT_SIZE 16 /* Size of the INIT packet */
-
-#define HVND_RING_SZ (PAGE_SIZE * 64)
-
-/* logging levels */
-#define HVND_ERROR 0
-#define HVND_WARN 1
-#define HVND_INFO 2
-#define HVND_DEBUG 3
-
-extern int hvnd_log_level;
-
-#define hvnd_error(fmt, args...) hvnd_log(HVND_ERROR, fmt, ##args)
-#define hvnd_warn(fmt, args...) hvnd_log(HVND_WARN, fmt, ##args)
-#define hvnd_info(fmt, args...) hvnd_log(HVND_INFO, fmt, ##args)
-#define hvnd_debug(fmt, args...) hvnd_log(HVND_DEBUG, fmt, ##args)
-
-#define hvnd_log(level, fmt, args...) \
-do { \
-    if (unlikely(hvnd_log_level >= (level))) \
-        printk(KERN_ERR "hvnd %s[%u]: " fmt, __func__, __LINE__, ##args); \
-} while (0)
-
-#endif /* _VMBUS_RDMA_H */
diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/COPYING b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/COPYING
new file mode 100644
index 000000000000..f0960c946b07
--- /dev/null
+++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/COPYING
@@ -0,0 +1,339 @@
+
+"This software program is licensed subject to the GNU General Public License
+(GPL). Version 2, June 1991, available at
+<http://www.fsf.org/copyleft/gpl.html>"
+
+GNU General Public License
+
+Version 2, June 1991
+
+Copyright (C) 1989, 1991 Free Software Foundation, Inc.
+51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+
+Everyone is permitted to copy and distribute verbatim copies of this license
+document, but changing it is not allowed.
+
+Preamble
+
+The licenses for most software are designed to take away your freedom to
+share and change it. By contrast, the GNU General Public License is intended
+to guarantee your freedom to share and change free software--to make sure
+the software is free for all its users. This General Public License applies
+to most of the Free Software Foundation's software and to any other program
+whose authors commit to using it. (Some other Free Software Foundation
+software is covered by the GNU Library General Public License instead.) You
+can apply it to your programs, too.
+
+When we speak of free software, we are referring to freedom, not price. Our
+General Public Licenses are designed to make sure that you have the freedom
+to distribute copies of free software (and charge for this service if you
+wish), that you receive source code or can get it if you want it, that you
+can change the software or use pieces of it in new free programs; and that
+you know you can do these things.
+
+To protect your rights, we need to make restrictions that forbid anyone to
+deny you these rights or to ask you to surrender the rights. These
+restrictions translate to certain responsibilities for you if you distribute
+copies of the software, or if you modify it.
+
+For example, if you distribute copies of such a program, whether gratis or
+for a fee, you must give the recipients all the rights that you have.
You +must make sure that they, too, receive or can get the source code. And you +must show them these terms so they know their rights. + +We protect your rights with two steps: (1) copyright the software, and (2) +offer you this license which gives you legal permission to copy, distribute +and/or modify the software. + +Also, for each author's protection and ours, we want to make certain that +everyone understands that there is no warranty for this free software. If +the software is modified by someone else and passed on, we want its +recipients to know that what they have is not the original, so that any +problems introduced by others will not reflect on the original authors' +reputations. + +Finally, any free program is threatened constantly by software patents. We +wish to avoid the danger that redistributors of a free program will +individually obtain patent licenses, in effect making the program +proprietary. To prevent this, we have made it clear that any patent must be +licensed for everyone's free use or not licensed at all. + +The precise terms and conditions for copying, distribution and modification +follow. + +TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION + +0. This License applies to any program or other work which contains a notice + placed by the copyright holder saying it may be distributed under the + terms of this General Public License. The "Program", below, refers to any + such program or work, and a "work based on the Program" means either the + Program or any derivative work under copyright law: that is to say, a + work containing the Program or a portion of it, either verbatim or with + modifications and/or translated into another language. (Hereinafter, + translation is included without limitation in the term "modification".) + Each licensee is addressed as "you". + + Activities other than copying, distribution and modification are not + covered by this License; they are outside its scope. The act of running + the Program is not restricted, and the output from the Program is covered + only if its contents constitute a work based on the Program (independent + of having been made by running the Program). Whether that is true depends + on what the Program does. + +1. You may copy and distribute verbatim copies of the Program's source code + as you receive it, in any medium, provided that you conspicuously and + appropriately publish on each copy an appropriate copyright notice and + disclaimer of warranty; keep intact all the notices that refer to this + License and to the absence of any warranty; and give any other recipients + of the Program a copy of this License along with the Program. + + You may charge a fee for the physical act of transferring a copy, and you + may at your option offer warranty protection in exchange for a fee. + +2. You may modify your copy or copies of the Program or any portion of it, + thus forming a work based on the Program, and copy and distribute such + modifications or work under the terms of Section 1 above, provided that + you also meet all of these conditions: + + * a) You must cause the modified files to carry prominent notices stating + that you changed the files and the date of any change. + + * b) You must cause any work that you distribute or publish, that in + whole or in part contains or is derived from the Program or any part + thereof, to be licensed as a whole at no charge to all third parties + under the terms of this License. 
+ + * c) If the modified program normally reads commands interactively when + run, you must cause it, when started running for such interactive + use in the most ordinary way, to print or display an announcement + including an appropriate copyright notice and a notice that there is + no warranty (or else, saying that you provide a warranty) and that + users may redistribute the program under these conditions, and + telling the user how to view a copy of this License. (Exception: if + the Program itself is interactive but does not normally print such + an announcement, your work based on the Program is not required to + print an announcement.) + + These requirements apply to the modified work as a whole. If identifiable + sections of that work are not derived from the Program, and can be + reasonably considered independent and separate works in themselves, then + this License, and its terms, do not apply to those sections when you + distribute them as separate works. But when you distribute the same + sections as part of a whole which is a work based on the Program, the + distribution of the whole must be on the terms of this License, whose + permissions for other licensees extend to the entire whole, and thus to + each and every part regardless of who wrote it. + + Thus, it is not the intent of this section to claim rights or contest + your rights to work written entirely by you; rather, the intent is to + exercise the right to control the distribution of derivative or + collective works based on the Program. + + In addition, mere aggregation of another work not based on the Program + with the Program (or with a work based on the Program) on a volume of a + storage or distribution medium does not bring the other work under the + scope of this License. + +3. You may copy and distribute the Program (or a work based on it, under + Section 2) in object code or executable form under the terms of Sections + 1 and 2 above provided that you also do one of the following: + + * a) Accompany it with the complete corresponding machine-readable source + code, which must be distributed under the terms of Sections 1 and 2 + above on a medium customarily used for software interchange; or, + + * b) Accompany it with a written offer, valid for at least three years, + to give any third party, for a charge no more than your cost of + physically performing source distribution, a complete machine- + readable copy of the corresponding source code, to be distributed + under the terms of Sections 1 and 2 above on a medium customarily + used for software interchange; or, + + * c) Accompany it with the information you received as to the offer to + distribute corresponding source code. (This alternative is allowed + only for noncommercial distribution and only if you received the + program in object code or executable form with such an offer, in + accord with Subsection b above.) + + The source code for a work means the preferred form of the work for + making modifications to it. For an executable work, complete source code + means all the source code for all modules it contains, plus any + associated interface definition files, plus the scripts used to control + compilation and installation of the executable. 
However, as a special + exception, the source code distributed need not include anything that is + normally distributed (in either source or binary form) with the major + components (compiler, kernel, and so on) of the operating system on which + the executable runs, unless that component itself accompanies the + executable. + + If distribution of executable or object code is made by offering access + to copy from a designated place, then offering equivalent access to copy + the source code from the same place counts as distribution of the source + code, even though third parties are not compelled to copy the source + along with the object code. + +4. You may not copy, modify, sublicense, or distribute the Program except as + expressly provided under this License. Any attempt otherwise to copy, + modify, sublicense or distribute the Program is void, and will + automatically terminate your rights under this License. However, parties + who have received copies, or rights, from you under this License will not + have their licenses terminated so long as such parties remain in full + compliance. + +5. You are not required to accept this License, since you have not signed + it. However, nothing else grants you permission to modify or distribute + the Program or its derivative works. These actions are prohibited by law + if you do not accept this License. Therefore, by modifying or + distributing the Program (or any work based on the Program), you + indicate your acceptance of this License to do so, and all its terms and + conditions for copying, distributing or modifying the Program or works + based on it. + +6. Each time you redistribute the Program (or any work based on the + Program), the recipient automatically receives a license from the + original licensor to copy, distribute or modify the Program subject to + these terms and conditions. You may not impose any further restrictions + on the recipients' exercise of the rights granted herein. You are not + responsible for enforcing compliance by third parties to this License. + +7. If, as a consequence of a court judgment or allegation of patent + infringement or for any other reason (not limited to patent issues), + conditions are imposed on you (whether by court order, agreement or + otherwise) that contradict the conditions of this License, they do not + excuse you from the conditions of this License. If you cannot distribute + so as to satisfy simultaneously your obligations under this License and + any other pertinent obligations, then as a consequence you may not + distribute the Program at all. For example, if a patent license would + not permit royalty-free redistribution of the Program by all those who + receive copies directly or indirectly through you, then the only way you + could satisfy both it and this License would be to refrain entirely from + distribution of the Program. + + If any portion of this section is held invalid or unenforceable under any + particular circumstance, the balance of the section is intended to apply + and the section as a whole is intended to apply in other circumstances. + + It is not the purpose of this section to induce you to infringe any + patents or other property right claims or to contest validity of any + such claims; this section has the sole purpose of protecting the + integrity of the free software distribution system, which is implemented + by public license practices. 
Many people have made generous contributions + to the wide range of software distributed through that system in + reliance on consistent application of that system; it is up to the + author/donor to decide if he or she is willing to distribute software + through any other system and a licensee cannot impose that choice. + + This section is intended to make thoroughly clear what is believed to be + a consequence of the rest of this License. + +8. If the distribution and/or use of the Program is restricted in certain + countries either by patents or by copyrighted interfaces, the original + copyright holder who places the Program under this License may add an + explicit geographical distribution limitation excluding those countries, + so that distribution is permitted only in or among countries not thus + excluded. In such case, this License incorporates the limitation as if + written in the body of this License. + +9. The Free Software Foundation may publish revised and/or new versions of + the General Public License from time to time. Such new versions will be + similar in spirit to the present version, but may differ in detail to + address new problems or concerns. + + Each version is given a distinguishing version number. If the Program + specifies a version number of this License which applies to it and "any + later version", you have the option of following the terms and + conditions either of that version or of any later version published by + the Free Software Foundation. If the Program does not specify a version + number of this License, you may choose any version ever published by the + Free Software Foundation. + +10. If you wish to incorporate parts of the Program into other free programs + whose distribution conditions are different, write to the author to ask + for permission. For software which is copyrighted by the Free Software + Foundation, write to the Free Software Foundation; we sometimes make + exceptions for this. Our decision will be guided by the two goals of + preserving the free status of all derivatives of our free software and + of promoting the sharing and reuse of software generally. + + NO WARRANTY + +11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY + FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN + OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES + PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER + EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE + ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH + YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL + NECESSARY SERVICING, REPAIR OR CORRECTION. + +12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING + WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR + REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR + DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL + DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM + (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED + INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF + THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR + OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 
+ +END OF TERMS AND CONDITIONS + +How to Apply These Terms to Your New Programs + +If you develop a new program, and you want it to be of the greatest +possible use to the public, the best way to achieve this is to make it free +software which everyone can redistribute and change under these terms. + +To do so, attach the following notices to the program. It is safest to +attach them to the start of each source file to most effectively convey the +exclusion of warranty; and each file should have at least the "copyright" +line and a pointer to where the full notice is found. + +one line to give the program's name and an idea of what it does. +Copyright (C) yyyy name of author + +This program is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the Free +Software Foundation; either version 2 of the License, or (at your option) +any later version. + +This program is distributed in the hope that it will be useful, but WITHOUT +ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for +more details. + +You should have received a copy of the GNU General Public License along with +this program; if not, write to the Free Software Foundation, Inc., 59 +Temple Place - Suite 330, Boston, MA 02111-1307, USA. + +Also add information on how to contact you by electronic and paper mail. + +If the program is interactive, make it output a short notice like this when +it starts in an interactive mode: + +Gnomovision version 69, Copyright (C) year name of author Gnomovision comes +with ABSOLUTELY NO WARRANTY; for details type 'show w'. This is free +software, and you are welcome to redistribute it under certain conditions; +type 'show c' for details. + +The hypothetical commands 'show w' and 'show c' should show the appropriate +parts of the General Public License. Of course, the commands you use may be +called something other than 'show w' and 'show c'; they could even be +mouse-clicks or menu items--whatever suits your program. + +You should also get your employer (if you work as a programmer) or your +school, if any, to sign a "copyright disclaimer" for the program, if +necessary. Here is a sample; alter the names: + +Yoyodyne, Inc., hereby disclaims all copyright interest in the program +'Gnomovision' (which makes passes at compilers) written by James Hacker. + +signature of Ty Coon, 1 April 1989 +Ty Coon, President of Vice + +This General Public License does not permit incorporating your program into +proprietary programs. If your program is a subroutine library, you may +consider it more useful to permit linking proprietary applications with the +library. If this is what you want to do, use the GNU Library General Public +License instead of this License. 
diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/Makefile b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/Makefile new file mode 100644 index 000000000000..19afb983ea70 --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/Makefile @@ -0,0 +1,3 @@ +obj-$(CONFIG_HYPERV_INFINIBAND_ND) += hv_network_direct_144.0.o + +hv_network_direct_144.0-y := provider.o vmbus_rdma.o hvnd_addr.o diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/hvnd_addr.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/hvnd_addr.c new file mode 100644 index 000000000000..c8c2295b63af --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/hvnd_addr.c @@ -0,0 +1,136 @@ +/* + * Copyright (c) 2014, Microsoft Corporation. + * + * Author: + * K. Y. Srinivasan <kys@microsoft.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. + * + * Bug fixes/enhancements: Long Li <longli@microsoft.com> + */ + +#include <linux/completion.h> +#include <linux/module.h> +#include <linux/errno.h> +#include <linux/hyperv.h> +#include <linux/efi.h> +#include <linux/slab.h> +#include <linux/cred.h> +#include <linux/sched.h> +#include <linux/types.h> +#include <linux/completion.h> +#include <linux/scatterlist.h> +#include <rdma/ib_umem.h> +#include <rdma/ib_addr.h> + +#include "vmbus_rdma.h" + + +#include <linux/semaphore.h> +#include <linux/fs.h> +#include <linux/nls.h> +#include <linux/workqueue.h> +#include <linux/cdev.h> +#include <linux/hyperv.h> +#include <linux/sched.h> +#include <linux/uaccess.h> +#include <linux/miscdevice.h> +#include <linux/hyperv.h> + + +int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + union nd_sockaddr_inet *og_addr) +{ + int ret; + /* + * Query the host and select the first address. + */ + struct pkt_query_addr_list pkt; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + + hvnd_init_hdr(&pkt.hdr, + (sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1)), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST, 0, 0, 0); + + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.reserved = 0; + pkt.ioctl.in.handle = uctx->adaptor_hdl; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, + sizeof(pkt), (u64)&pkt); + + if (ret) + return ret; + + /* + * Copy the address out. 
+ */ + + memcpy(og_addr, &pkt.ioctl.out[0], sizeof(*og_addr)); + return 0; + +} + +static struct rdma_addr_client self; + +struct resolve_cb_context { + struct rdma_dev_addr *addr; + struct completion comp; +}; + +void hvnd_addr_init(void) +{ + rdma_addr_register_client(&self); +} + +void hvnd_addr_deinit(void) +{ + rdma_addr_unregister_client(&self); +} + +static void resolve_cb(int status, struct sockaddr *src_addr, + struct rdma_dev_addr *addr, void *context) +{ + memcpy(((struct resolve_cb_context *)context)->addr, addr, sizeof(struct + rdma_dev_addr)); + complete(&((struct resolve_cb_context *)context)->comp); +} + +int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote, + char *mac_addr) +{ + struct rdma_dev_addr dev_addr; + struct resolve_cb_context ctx; + int ret; + + memset(&dev_addr, 0, sizeof(dev_addr)); + dev_addr.net = &init_net; + ctx.addr = &dev_addr; + init_completion(&ctx.comp); + + ret = rdma_resolve_ip(&self, local, remote, &dev_addr, 1000, + resolve_cb, &ctx); + + if (ret) { + hvnd_error("rdma_resolve_ip failed ret=%d\n", ret); + return ret; + } + + wait_for_completion(&ctx.comp); + memcpy(mac_addr, dev_addr.dst_dev_addr, ETH_ALEN); + return ret; +} diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/mx_abi.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/mx_abi.h new file mode 100644 index 000000000000..8f55cb408de0 --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/mx_abi.h @@ -0,0 +1,232 @@ +/* + * Copyright (c) 2005 Topspin Communications. All rights reserved. + * Copyright (c) 2005 Cisco Systems. All rights reserved. + * Copyright (c) 2005 PathScale, Inc. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * KYS: made some modifications. + */ + +#ifndef MX_ABI_H +#define MX_ABI_H + + + +/* + * Make sure that all structs defined in this file remain laid out so + * that they pack the same way on 32-bit and 64-bit architectures (to + * avoid incompatibility between 32-bit userspace and 64-bit kernels). + * Specifically: + * - Do not use pointer types -- pass pointers in UINT64 instead. + * - Make sure that any structure larger than 4 bytes is padded to a + * multiple of 8 bytes. 
Otherwise the structure size will be + * different between 32-bit and 64-bit architectures. + */ + +enum ibv_get_context_mappings { + IBV_GET_CONTEXT_UAR, + IBV_GET_CONTEXT_BF, + IBV_GET_CONTEXT_MAPPING_MAX +}; + +struct ibv_get_context_req { + + union nd_mapping mappings[IBV_GET_CONTEXT_MAPPING_MAX]; +}; + +struct ibv_get_context_resp { + + // mmap UAR and BF + struct nd_mapping_result mapping_results[IBV_GET_CONTEXT_MAPPING_MAX]; + + // mmap Blue Flame + int bf_buf_size; + int bf_offset; + + // mlx4_query_device result + int max_qp_wr; + int max_sge; + int max_cqe; + + // general parameters + u32 cqe_size; + u32 vend_id; + u16 dev_id; + u16 bf_reg_size; + u16 bf_regs_per_page; + u16 reserved1; + + // ibv_cmd_get_context result + u32 qp_tab_size; + + u32 reserved2; +}; + +struct ibv_alloc_pd_resp { + u64 pd_handle; + u32 pdn; + u32 reserved; +}; + +struct ibv_reg_mr { + u64 start; + u64 length; + u64 hca_va; + u32 access_flags; + u32 pdn; + u64 pd_handle; +}; + +struct ibv_reg_mr_resp { + u64 mr_handle; + u32 lkey; + u32 rkey; +}; + + +enum mlx4_ib_create_cq_mapping { + MLX4_IB_CREATE_CQ_BUF, + MLX4_IB_CREATE_CQ_DB, + MLX4_IB_CREATE_CQ_ARM_SN, // Windows specific + MLX4_IB_CREATE_CQ_MAPPING_MAX +}; + +#define MLX4_CQ_FLAGS_ARM_IN_KERNEL 1 + +struct ibv_create_cq { + union nd_mapping mappings[MLX4_IB_CREATE_CQ_MAPPING_MAX]; + u32 flags; +}; + +struct ibv_create_cq_resp { + struct nd_mapping_result mapping_results[MLX4_IB_CREATE_CQ_MAPPING_MAX]; + u32 cqn; + u32 cqe; +}; + +enum mlx4_ib_create_srq_mappings { + MLX4_IB_CREATE_SRQ_BUF, + MLX4_IB_CREATE_SRQ_DB, + MLX4_IB_CREATE_SRQ_MAPPINGS_MAX +}; + +struct ibv_create_srq { + union nd_mapping mappings[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX]; +}; + +struct ibv_create_srq_resp { + struct nd_mapping_result mapping_results[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX]; +}; + +enum mlx4_ib_create_qp_mappings { + MLX4_IB_CREATE_QP_BUF, + MLX4_IB_CREATE_QP_DB, + MLX4_IB_CREATE_QP_MAPPINGS_MAX +}; + +struct ibv_create_qp { + union nd_mapping mappings[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; + u8 log_sq_bb_count; + u8 log_sq_stride; + u8 sq_no_prefetch; + u8 reserved; +}; + +struct ibv_create_qp_resp { + struct nd_mapping_result mapping_results[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; + // struct ib_uverbs_create_qp_resp + u64 qp_handle; + u32 qpn; + u32 max_send_wr; + u32 max_recv_wr; + u32 max_send_sge; + u32 max_recv_sge; + u32 max_inline_data; +}; + +enum ibv_qp_attr_mask { + IBV_QP_STATE = 1 << 0, + IBV_QP_CUR_STATE = 1 << 1, + IBV_QP_EN_SQD_ASYNC_NOTIFY = 1 << 2, + IBV_QP_ACCESS_FLAGS = 1 << 3, + IBV_QP_PKEY_INDEX = 1 << 4, + IBV_QP_PORT = 1 << 5, + IBV_QP_QKEY = 1 << 6, + IBV_QP_AV = 1 << 7, + IBV_QP_PATH_MTU = 1 << 8, + IBV_QP_TIMEOUT = 1 << 9, + IBV_QP_RETRY_CNT = 1 << 10, + IBV_QP_RNR_RETRY = 1 << 11, + IBV_QP_RQ_PSN = 1 << 12, + IBV_QP_MAX_QP_RD_ATOMIC = 1 << 13, + IBV_QP_ALT_PATH = 1 << 14, + IBV_QP_MIN_RNR_TIMER = 1 << 15, + IBV_QP_SQ_PSN = 1 << 16, + IBV_QP_MAX_DEST_RD_ATOMIC = 1 << 17, + IBV_QP_PATH_MIG_STATE = 1 << 18, + IBV_QP_CAP = 1 << 19, + IBV_QP_DEST_QPN = 1 << 20 +}; + +enum ibv_qp_state { + IBV_QPS_RESET, + IBV_QPS_INIT, + IBV_QPS_RTR, + IBV_QPS_RTS, + IBV_QPS_SQD, + IBV_QPS_SQE, + IBV_QPS_ERR +}; + + +struct ibv_modify_qp_resp { + enum ibv_qp_attr_mask attr_mask; + u8 qp_state; + u8 reserved[3]; +}; + +struct ibv_create_ah_resp { + u64 start; +}; + +/* + * Some mlx4 specific kernel definitions. Perhaps could be in + * separate file. 
+ */ + +struct mlx4_ib_user_db_page { + struct list_head list; + struct ib_umem *umem; + unsigned long user_virt; + int refcnt; +}; + + +#endif /* MX_ABI_H */ diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/provider.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/provider.c new file mode 100644 index 000000000000..7115cf769fe0 --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/provider.c @@ -0,0 +1,2961 @@ +/* + * Copyright (c) 2014, Microsoft Corporation. + * + * Author: + * K. Y. Srinivasan <kys@microsoft.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. + * + * Bug fixes/enhancements: Long Li <longli@microsoft.com> + */ + +#include <linux/module.h> +#include <linux/moduleparam.h> +#include <linux/device.h> +#include <linux/netdevice.h> +#include <linux/etherdevice.h> +#include <linux/delay.h> +#include <linux/errno.h> +#include <linux/list.h> +#include <linux/spinlock.h> +#include <linux/ethtool.h> +#include <linux/rtnetlink.h> +#include <linux/inetdevice.h> +#include <linux/io.h> +#include <linux/hyperv.h> +#include <linux/completion.h> +#include <rdma/iw_cm.h> +#include <rdma/ib_verbs.h> +#include <rdma/ib_smi.h> +#include <rdma/ib_umem.h> +#include <rdma/ib_user_verbs.h> +#include <rdma/rdma_cm.h> +#include <linux/proc_fs.h> +#include <linux/seq_file.h> +#include "vmbus_rdma.h" + +/* + * We are emulating mlx4. XXXKYS: May have to FIX. 
+ */ +#include "./user.h" + +static struct hvnd_dev *g_nd_dev; // the one and only one + +int hvnd_log_level = HVND_ERROR; +module_param(hvnd_log_level, int, 0644); +MODULE_PARM_DESC(hvnd_log_level, + "Logging level, 0 - Error (default), 1 - Warning, 2 - Info, 3 - Debug."); + +static int disable_cq_notify = 1; +//static int disable_cq_notify = 0; +module_param(disable_cq_notify, int, 0644); +MODULE_PARM_DESC(disable_cq_notify, + "Disable CQ notification, 0 - Enable, 1 - Disable (default)."); + +enum { + MLX4_USER_DEV_CAP_64B_CQE = 1L << 0 +}; + +#define HVND_NODE_DESC "vmbus-RDMA" + +#undef MLX4_IB_UVERBS_ABI_VERSION +#define MLX4_IB_UVERBS_ABI_VERSION 4 + +struct mlx4_wqe_data_seg { + __be32 byte_count; + __be32 lkey; + __be64 addr; +}; + +/* return value: + * true: ep is running + * false: ep is stopped + */ +bool ep_add_work_pending(struct hvnd_ep_obj *ep_object) +{ + bool ret = true; + + atomic_inc(&ep_object->nr_requests_pending); + if (ep_object->stopping) { + if (atomic_dec_and_test(&ep_object->nr_requests_pending)) + wake_up(&ep_object->wait_pending); + ret = false; + } + return ret; +} + +void ep_del_work_pending(struct hvnd_ep_obj *ep_object) +{ + if (atomic_dec_and_test(&ep_object->nr_requests_pending)) + wake_up(&ep_object->wait_pending); + + if (atomic_read(&ep_object->nr_requests_pending) < 0) { + hvnd_error("ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", + atomic_read(&ep_object->nr_requests_pending), + ep_object->type, ep_object->cm_state); + dump_stack(); + } +} + +void ep_stop(struct hvnd_ep_obj *ep_object) +{ + if (!ep_object->stopping) { + ep_object->stopping = true; + hvnd_cancel_io(ep_object); + } + + if (atomic_read(&ep_object->nr_requests_pending) < 0) { + hvnd_error("IO canceled, ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", + atomic_read(&ep_object->nr_requests_pending), + ep_object->type, ep_object->cm_state); + dump_stack(); + } + + wait_event(ep_object->wait_pending, + !atomic_read(&ep_object->nr_requests_pending)); +} + +static int vmbus_dma_map_sg(struct device *dev, struct scatterlist *sgl, + int nents, enum dma_data_direction direction, + unsigned long attrs) +{ + struct scatterlist *sg; + u64 addr; + int i; + int ret = nents; + + BUG_ON(!valid_dma_direction(direction)); + + for_each_sg(sgl, sg, nents, i) { + addr = (u64) page_address(sg_page(sg)); + /* TODO: handle highmem pages */ + if (!addr) { + ret = 0; + break; + } + sg->dma_address = addr + sg->offset; + sg->dma_length = sg->length; + } + return ret; +} + +static void vmbus_dma_unmap_sg(struct device *dev, + struct scatterlist *sg, int nents, + enum dma_data_direction direction, unsigned long attrs) +{ + BUG_ON(!valid_dma_direction(direction)); +} + + +const struct dma_map_ops vmbus_dma_ops = { + .map_sg = vmbus_dma_map_sg, + .unmap_sg = vmbus_dma_unmap_sg, +}; + +static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, + struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx); + +static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, + struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx); + +static void hvnd_deinit_ep(struct hvnd_ep_obj *ep) +{ + put_irp_handle(ep->nd_dev, ep->local_irp); +} + +static void hvnd_destroy_ep(struct hvnd_ep_obj *ep) +{ + hvnd_debug("canceling work for ep %p\n", ep); + cancel_work_sync(&ep->wrk.work); + hvnd_deinit_ep(ep); + kfree(ep); +} + + +#define UC(b) (((int)b)&0xff) +char *debug_inet_ntoa(struct in_addr in, char *b) +{ + register char *p; + + p = (char *)∈ + (void)snprintf(b, 20, + "%d.%d.%d.%d", 
UC(p[0]), UC(p[1]), UC(p[2]), UC(p[3])); + return b; +} + +void hvnd_process_events(struct work_struct *work); + +static int hvnd_init_ep(struct hvnd_ep_obj *ep_object, + struct iw_cm_id *cm_id, int ep_type, + struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx) +{ + int ret; + + ep_object->type = ep_type; + ep_object->cm_id = cm_id; + ep_object->nd_dev = nd_dev; + ep_object->uctx = uctx; + + ep_object->parent = NULL; + + ep_object->wrk.callback_arg = ep_object; + INIT_WORK(&ep_object->wrk.work, hvnd_process_events); + INIT_LIST_HEAD(&ep_object->incoming_pkt_list); + spin_lock_init(&ep_object->incoming_pkt_list_lock); + + ep_object->stopping = false; + atomic_set(&ep_object->nr_requests_pending, 0); + init_waitqueue_head(&ep_object->wait_pending); + + ret = get_irp_handle(nd_dev, &ep_object->local_irp, (void *)ep_object); + + if (ret) { + hvnd_error("get_irp_handle() failed: err: %d\n", ret); + return ret; + } + return 0; + +} + +static int set_rq_size(struct hvnd_dev *dev, struct ib_qp_cap *cap, + struct hvnd_qp *qp) +{ + + /* HW requires >= 1 RQ entry with >= 1 gather entry */ + if (!cap->max_recv_wr || !cap->max_recv_sge) + return -EINVAL; + + qp->rq_wqe_cnt = roundup_pow_of_two(max(1U, cap->max_recv_wr)); + qp->rq_max_gs = roundup_pow_of_two(max(1U, cap->max_recv_sge)); + qp->rq_wqe_shift = ilog2( + qp->rq_max_gs * sizeof(struct mlx4_wqe_data_seg)); + + + return 0; +} + +static int set_user_sq_size(struct hvnd_dev *dev, + struct hvnd_qp *qp, + struct mlx4_ib_create_qp *ucmd) +{ + qp->sq_wqe_cnt = 1 << ucmd->log_sq_bb_count; + qp->sq_wqe_shift = ucmd->log_sq_stride; + + qp->buf_size = (qp->rq_wqe_cnt << qp->rq_wqe_shift) + + (qp->sq_wqe_cnt << qp->sq_wqe_shift); + + return 0; +} + +static int hvnd_db_map_user(struct hvnd_ucontext *uctx, unsigned long virt, + struct ib_umem **db_umem) +{ + struct mlx4_ib_user_db_page *page; + int err = 0; + + mutex_lock(&uctx->db_page_mutex); + + list_for_each_entry(page, &uctx->db_page_list, list) + if (page->user_virt == (virt & PAGE_MASK)) + goto found; + + page = kmalloc(sizeof(*page), GFP_KERNEL); + if (!page) { + err = -ENOMEM; + goto out; + } + + page->user_virt = (virt & PAGE_MASK); + page->refcnt = 0; + page->umem = ib_umem_get(&uctx->ibucontext, virt & PAGE_MASK, + PAGE_SIZE, 0, 0); + if (IS_ERR(page->umem)) { + hvnd_error("ib_umem_get failure\n"); + err = PTR_ERR(page->umem); + kfree(page); + goto out; + } + + list_add(&page->list, &uctx->db_page_list); + +found: + ++page->refcnt; +out: + mutex_unlock(&uctx->db_page_mutex); + if (!err) + *db_umem = page->umem; + + return err; +} + +static void hvnd_db_unmap_user(struct hvnd_ucontext *uctx, u64 db_addr) +{ + struct mlx4_ib_user_db_page *page; + + mutex_lock(&uctx->db_page_mutex); + list_for_each_entry(page, &uctx->db_page_list, list) + if (page->user_virt == (db_addr & PAGE_MASK)) + goto found; + +found: + if (!--page->refcnt) { + list_del(&page->list); + ib_umem_release(page->umem); + kfree(page); + } + + mutex_unlock(&uctx->db_page_mutex); +} + + +static void debug_check(const char *func, int line) +{ + hvnd_debug("func is: %s; line is %d\n", func, line); + + if (in_interrupt()) { + hvnd_error("In interrupt func is: %s; line is %d\n", + func, line); + return; + } +} + +static struct ib_ah *hvnd_ah_create(struct ib_pd *pd, + struct ib_ah_attr *ah_attr, + struct ib_udata *udata) +{ + debug_check(__func__, __LINE__); + return ERR_PTR(-ENOSYS); +} + +static int hvnd_ah_destroy(struct ib_ah *ah) +{ + debug_check(__func__, __LINE__); + return -ENOSYS; +} + +static int hvnd_multicast_attach(struct 
ib_qp *ibqp, union ib_gid *gid, u16 lid) +{ + debug_check(__func__, __LINE__); + return -ENOSYS; +} + +static int hvnd_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) +{ + debug_check(__func__, __LINE__); + return -ENOSYS; +} + +static int hvnd_process_mad(struct ib_device *ibdev, + int mad_flags, + u8 port_num, + const struct ib_wc *in_wc, + const struct ib_grh *in_grh, + const struct ib_mad_hdr *in_mad, + size_t in_mad_size, + struct ib_mad_hdr *out_mad, + size_t *out_mad_size, + u16 *out_mad_pkey_index) +{ + debug_check(__func__, __LINE__); + return -ENOSYS; +} + +void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx) +{ + atomic_inc(&uctx->refcnt); +} + +void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx) +{ + if (atomic_dec_and_test(&uctx->refcnt)) { + hvnd_debug("uctx ref cnt dropped it is %d\n", + atomic_read(&uctx->refcnt)); + hvnd_debug("About to close adaptor\n"); + hvnd_close_adaptor(nd_dev, uctx); + } else + hvnd_debug("uctx ref cnt dropped it is %d\n", + atomic_read(&uctx->refcnt)); +} + + +static int hvnd_dealloc_ucontext(struct ib_ucontext *context) +{ + struct hvnd_dev *nd_dev; + struct hvnd_ucontext *uctx; + + uctx = to_nd_context(context); + nd_dev = to_nd_dev(context->device); + + hvnd_debug("calling %s\n", __func__); + + hvnd_drop_uctx_ref(nd_dev, uctx); + + return 0; +} + +static struct ib_ucontext *hvnd_alloc_ucontext(struct ib_device *ibdev, + struct ib_udata *udata) +{ + struct hvnd_dev *nd_dev = to_nd_dev(ibdev); + struct hvnd_ucontext *uctx; + struct mlx4_ib_alloc_ucontext_resp resp; + struct ibv_get_context_resp *ctx_output; + int ret; + + if (!nd_dev->ib_active) { + hvnd_error("ib device is not active, try again\n"); + return ERR_PTR(-EAGAIN); + } + + uctx = get_uctx(nd_dev, current_pid()); + if (uctx) { + // it is already opened, just increase its reference count + hvnd_acquire_uctx_ref(uctx); + } else { + + /* + * The Windows host expects the following to be done: + * 1. Successfully send struct ndv_pkt_hdr_create_1 + * 2. INIT PROVIDER + * 3. Open Adapter + * Before we can complete this call. + */ + + uctx = kzalloc(sizeof(struct hvnd_ucontext), GFP_KERNEL); + if (!uctx) + return ERR_PTR(-ENOMEM); + + atomic_set(&uctx->refcnt, 1); + INIT_LIST_HEAD(&uctx->db_page_list); + mutex_init(&uctx->db_page_mutex); + + /* + * Stash away the context with the calling PID. + */ + ret = insert_handle(nd_dev, &nd_dev->uctxidr, uctx, + current_pid()); + if (ret) { + hvnd_error("Uctx ID insertion failed; ret is %d\n", + ret); + goto err1; + } + + hvnd_debug("Opening adaptor pid is %d\n", current_pid()); + + ret = hvnd_open_adaptor(nd_dev, uctx); + if (ret) { + hvnd_error("hvnd_open_adaptor failed ret=%d\n", ret); + goto err1; + } + + } + + /* + * Copy the response out. 
+ */ + ctx_output = &uctx->o_adap_pkt.mappings.ctx_output; + resp.dev_caps = MLX4_USER_DEV_CAP_64B_CQE; + resp.qp_tab_size = ctx_output->qp_tab_size; + resp.bf_reg_size = ctx_output->bf_reg_size; + resp.bf_regs_per_page = ctx_output->bf_regs_per_page; + resp.cqe_size = ctx_output->cqe_size; + + ret = ib_copy_to_udata(udata, &resp, sizeof(resp)); + if (ret) { + hvnd_error("ib_copy_to_udata failed ret=%d\n", ret); + goto err1; + } + + return &uctx->ibucontext; + +err1: + kfree(uctx); + return ERR_PTR(ret); +} + +static int hvnd_mmap(struct ib_ucontext *context, struct vm_area_struct *vma) +{ + struct hvnd_ucontext *uctx = to_nd_context(context); + + if (vma->vm_end - vma->vm_start != PAGE_SIZE) { + hvnd_error("vma not a page size, actual size=%lu\n", + vma->vm_end - vma->vm_start); + return -EINVAL; + } + + if (vma->vm_pgoff == 0) { + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); + + if (io_remap_pfn_range(vma, vma->vm_start, + (uctx->uar_base >> PAGE_SHIFT), + PAGE_SIZE, vma->vm_page_prot)) { + hvnd_error("io_remap_pfn_range failure\n"); + return -EAGAIN; + } + } else if (vma->vm_pgoff == 1 && uctx->bf_buf_size != 0) { + vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); + + if (io_remap_pfn_range(vma, vma->vm_start, + (uctx->uar_base >> PAGE_SHIFT) + 1, + PAGE_SIZE, vma->vm_page_prot)) { + hvnd_error("io_remap_pfn_range failure\n"); + return -EAGAIN; + } + } else { + hvnd_error("check code\n"); + return -EINVAL; + } + + return 0; +} + +static int hvnd_deallocate_pd(struct ib_pd *pd) +{ + + struct hvnd_ucontext *uctx; + struct hvnd_dev *nd_dev; + struct hvnd_ib_pd *hvnd_pd; + struct ib_ucontext *ibuctx = pd->uobject->context; + + hvnd_pd = to_nd_pd(pd); + nd_dev = to_nd_dev(pd->device); + uctx = to_nd_context(ibuctx); + + hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, + IOCTL_ND_PD_FREE); + + hvnd_drop_uctx_ref(nd_dev, uctx); + return 0; +} + +static struct ib_pd *hvnd_allocate_pd(struct ib_device *ibdev, + struct ib_ucontext *context, + struct ib_udata *udata) +{ + struct hvnd_ucontext *uctx; + struct hvnd_dev *nd_dev; + int ret; + struct hvnd_ib_pd *hvnd_pd; + + if (!context) { + hvnd_error("kernel mode context not supported\n"); + return ERR_PTR(-EINVAL); + } + + hvnd_pd = kzalloc(sizeof(struct hvnd_ib_pd), GFP_KERNEL); + + if (!hvnd_pd) + return ERR_PTR(-ENOMEM); + + uctx = to_nd_context(context); + nd_dev = to_nd_dev(ibdev); + + ret = hvnd_create_pd(uctx, nd_dev, hvnd_pd); + if (ret) { + hvnd_error("hvnd_create_pd failure ret=%d\n", ret); + goto error_cr_pd; + } + + if (context) { + if (ib_copy_to_udata(udata, &hvnd_pd->pdn, sizeof(__u32))) { + hvnd_error("ib_copy_to_udata failure\n"); + ret = -EFAULT; + goto error_fault; + } + } + + hvnd_acquire_uctx_ref(uctx); + return &hvnd_pd->ibpd; + +error_fault: + hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, + IOCTL_ND_PD_FREE); + +error_cr_pd: + kfree(hvnd_pd); + return ERR_PTR(ret); +} + +static int hvnd_query_pkey(struct ib_device *ibdev, u8 port, u16 index, + u16 *pkey) +{ + debug_check(__func__, __LINE__); + *pkey = 0; + return 0; +} + +static int hvnd_query_gid(struct ib_device *ibdev, u8 port, int index, + union ib_gid *gid) +{ + int ret; + struct hvnd_dev *nd_dev = to_nd_dev(ibdev); + + debug_check(__func__, __LINE__); + + ret = wait_for_completion_timeout(&nd_dev->addr_set, 60*HZ); + if (!ret) + return -ETIMEDOUT; + + memset(&(gid->raw[0]), 0, sizeof(gid->raw)); + memcpy(&(gid->raw[0]), nd_dev->mac_addr, 6); + return 0; +} + +static int hvnd_query_device(struct ib_device *ibdev, + struct ib_device_attr *props, + 
struct ib_udata *udata) +{ + struct hvnd_dev *nd_dev = to_nd_dev(ibdev); + struct adapter_info_v2 *adap_info; + + if (!nd_dev->query_pkt_set) { + hvnd_error("query packet not received yet\n"); + return 0; + } + + adap_info = &nd_dev->query_pkt.ioctl.ad_info; + + memset(props, 0, sizeof(*props)); + + /* + * Copy the relevant properties out. + */ + props->fw_ver = 0; + props->device_cap_flags = 0; + //props->device_cap_flags |= IB_DEVICE_BAD_PKEY_CNTR; + //props->device_cap_flags |= IB_DEVICE_BAD_QKEY_CNTR; + //props->device_cap_flags |= IB_DEVICE_XRC; + + props->vendor_id = 0x15b3; + props->vendor_part_id = adap_info->device_id; + + props->max_mr_size = ~0ull; + props->page_size_cap = PAGE_SIZE; + props->max_qp = 16384; + props->max_qp_wr = min(adap_info->max_recv_q_depth, + adap_info->max_initiator_q_depth); + + props->max_sge = min(adap_info->max_initiator_sge, + adap_info->max_recv_sge); + props->max_cq = 0x1FFFF; + props->max_cqe = adap_info->max_completion_q_depth; + props->max_mr = 16384; + props->max_pd = 16384; + + props->max_qp_rd_atom = adap_info->max_inbound_read_limit; + props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; + props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; + props->max_srq = 16384; + props->max_srq_wr = adap_info->max_recv_q_depth; + props->max_srq_sge = adap_info->max_recv_sge; + + return 0; +} + +static int hvnd_query_port(struct ib_device *ibdev, u8 port, + struct ib_port_attr *props) +{ + memset(props, 0, sizeof(struct ib_port_attr)); + + props->max_mtu = IB_MTU_4096; + props->active_mtu = IB_MTU_4096; + + /* + * KYS: TBD need to base this on netdev. + */ + props->state = IB_PORT_ACTIVE; + + props->port_cap_flags = IB_PORT_CM_SUP; + + props->gid_tbl_len = 1; + props->pkey_tbl_len = 1; + props->active_width = 1; + props->active_speed = IB_SPEED_DDR; //KYS: check + props->max_msg_sz = -1; + + return 0; +} + +static enum rdma_link_layer +hvnd_get_link_layer(struct ib_device *device, u8 port) +{ + return IB_LINK_LAYER_ETHERNET; +} + +static ssize_t hvnd_show_rev(struct device *dev, struct device_attribute *attr, + char *buf) +{ + return 0; +} + +static ssize_t hvnd_show_fw_ver(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return 0; +} + +static ssize_t hvnd_show_hca(struct device *dev, struct device_attribute *attr, + char *buf) +{ + return 0; +} + +static ssize_t hvnd_show_board(struct device *dev, + struct device_attribute *attr, char *buf) +{ + return 0; +} + +static int hvnd_get_port_immutable(struct ib_device *ibdev, u8 port_num, + struct ib_port_immutable *immutable) +{ + struct ib_port_attr attr; + int err; + + err = hvnd_query_port(ibdev, port_num, &attr); + if (err) + return err; + + immutable->pkey_tbl_len = attr.pkey_tbl_len; + immutable->gid_tbl_len = attr.gid_tbl_len; + immutable->core_cap_flags = RDMA_CORE_PORT_IWARP; + + return 0; +} + +static struct ib_qp *hvnd_ib_create_qp(struct ib_pd *pd, + struct ib_qp_init_attr *attrs, + struct ib_udata *udata) +{ + struct hvnd_ucontext *uctx; + struct hvnd_dev *nd_dev; + struct mlx4_ib_create_qp ucmd; + struct hvnd_qp *qp; + int ret = 0; + struct hvnd_ib_pd *hvnd_pd = to_nd_pd(pd); + struct hvnd_cq *send_cq = to_nd_cq(attrs->send_cq); + struct hvnd_cq *recv_cq = to_nd_cq(attrs->recv_cq); + + uctx = get_uctx_from_pd(pd); + nd_dev = to_nd_dev(pd->device); + + if (attrs->qp_type != IB_QPT_RC) { + hvnd_error("attrs->qp_type=%d not IB_QPT_RC\n", attrs->qp_type); + return ERR_PTR(-EINVAL); + } + + qp = kzalloc(sizeof(*qp), GFP_KERNEL); + if (!qp) { + ret = 
-ENOMEM; + goto err_done; + } + + qp->uctx = uctx; + + if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { + hvnd_error("ib_copy_from_udata failed\n"); + ret = -EFAULT; + goto err_ucpy; + } + + qp->qp_buf = (void *)ucmd.buf_addr; + qp->db_addr = (void *)ucmd.db_addr; + qp->log_sq_bb_count = ucmd.log_sq_bb_count; + qp->log_sq_stride = ucmd.log_sq_stride; + qp->sq_no_prefetch = ucmd.sq_no_prefetch; + qp->port = attrs->port_num; + + init_waitqueue_head(&qp->wait); + atomic_set(&qp->refcnt, 1); + + qp->recv_cq = recv_cq; + qp->send_cq = send_cq; + qp->nd_dev = nd_dev; + + qp->receive_cq_handle = recv_cq->cq_handle; + qp->initiator_cq_handle = send_cq->cq_handle; + qp->pd_handle = hvnd_pd->handle; + qp->cq_notify = false; + + qp->ibqp.qp_num = attrs->qp_type == IB_QPT_SMI ? 0 : 1; + + qp->max_inline_data = attrs->cap.max_inline_data; + + qp->initiator_q_depth = attrs->cap.max_send_wr; + qp->initiator_request_sge = attrs->cap.max_send_sge; + + + qp->receive_q_depth = attrs->cap.max_recv_wr; + qp->receive_request_sge = attrs->cap.max_recv_sge; + + set_rq_size(nd_dev, &attrs->cap, qp); + + set_user_sq_size(nd_dev, qp, &ucmd); + + qp->umem = ib_umem_get(&uctx->ibucontext, ucmd.buf_addr, + qp->buf_size, 0, 0); + if (IS_ERR(qp->umem)) { + ret = PTR_ERR(qp->umem); + hvnd_error("ib_umem_get failed ret=%d\n", ret); + goto err_ucpy; + } + + ret = hvnd_db_map_user(uctx, ucmd.db_addr, &qp->db_umem); + + if (ret) { + hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); + goto err_db_map; + } + + ret = hvnd_create_qp(nd_dev, uctx, qp); + + if (ret) { + hvnd_error("hvnd_create_qp failed ret=%d\n", ret); + goto err_qp; + } + + hvnd_acquire_uctx_ref(uctx); + + qp->ibqp.qp_num = qp->qpn; + qp->ibqp.qp_type = IB_QPT_RC; + + + return &qp->ibqp; + +err_qp: + hvnd_db_unmap_user(uctx, ucmd.db_addr); + +err_db_map: + ib_umem_release(qp->umem); + +err_ucpy: + kfree(qp); +err_done: + return ERR_PTR(ret); +} + +static int hvnd_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, + int attr_mask, struct ib_udata *udata) +{ + struct hvnd_qp *qp = to_nd_qp(ibqp); + struct hvnd_dev *nd_dev = to_nd_dev(ibqp->device); + enum ib_qp_state cur_state, new_state; + int ret = 0; + + if (attr != NULL) { + cur_state = (attr_mask & IB_QP_CUR_STATE) + ? attr->cur_qp_state : qp->qp_state; + new_state = (attr_mask & IB_QP_STATE) + ? attr->qp_state : cur_state; + + hvnd_debug("qp->qp_state is %d new state is %d\n", + qp->qp_state, new_state); + hvnd_debug("current qp state is %d\n", cur_state); + if (attr_mask & IB_QP_STATE) { + /* Ensure the state is valid */ + if (attr->qp_state < 0 || attr->qp_state > IB_QPS_ERR) { + hvnd_error("incorrect qp state attr->qp_state=%d\n", + attr->qp_state); + return -EINVAL; + } + + if (qp->qp_state != new_state) { + qp->qp_state = new_state; + /* + * The only state transition supported is the + * transition to error state. 
+ */ + switch (new_state) { + case IB_QPS_ERR: + case IB_QPS_SQD: + ret = hvnd_flush_qp(nd_dev, qp->uctx, + qp); + + if (ret) + hvnd_error("hvnd_flush_qp failed ret=%d\n", ret); + + // immediately notify the upper layer + // on disconnection + if (!ret && qp->connector) + hvnd_process_notify_disconnect( + qp->connector, + STATUS_SUCCESS); + + return ret; + + default: + break; + } + } + } + } + return 0; +} + + +static int hvnd_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, + int attr_mask, struct ib_qp_init_attr *init_attr) +{ + struct hvnd_qp *qp = to_nd_qp(ibqp); + + memset(attr, 0, sizeof(*attr)); + memset(init_attr, 0, sizeof(*init_attr)); + + attr->qp_state = qp->qp_state; + + init_attr->cap.max_send_wr = qp->max_send_wr; + init_attr->cap.max_recv_wr = qp->max_recv_wr; + + init_attr->cap.max_send_sge = qp->max_send_sge; + init_attr->cap.max_recv_sge = qp->max_recv_sge; + init_attr->cap.max_inline_data = qp->max_inline_data; + + init_attr->sq_sig_type = IB_SIGNAL_ALL_WR; + + return 0; +} + +static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status); +static int hvnd_destroy_qp(struct ib_qp *ib_qp) +{ + int ret; + struct hvnd_qp *qp = to_nd_qp(ib_qp); + struct hvnd_dev *nd_dev = to_nd_dev(ib_qp->device); + u64 jiffies; + + if (!qp->connector) { + hvnd_warn("error: connector is NULL; skip destroying connector\n"); + goto free_qp; + } + + /* + * should we flush the qp first on ctrl-C?, + * no need to disconnect on abrupt shutdown? + */ + if (qp->qp_state != IB_QPS_ERR && qp->qp_state != IB_QPS_SQD) { + hvnd_warn("qp_state=%d, doing abrupt disconnect\n", + qp->qp_state); + hvnd_flush_qp(nd_dev, qp->uctx, qp); + + ep_stop(qp->connector); + + // now no pending activity is possible on the connector + + switch (qp->connector->cm_state) { + + case hvnd_cm_idle: + case hvnd_cm_connect_reply_refused: + case hvnd_cm_connect_request_sent: + case hvnd_cm_close_sent: + hvnd_warn("cm_state = %d not doing anything\n", + qp->connector->cm_state); + break; + + case hvnd_cm_connect_received: + hvnd_warn("cm_state = %d refusing pending connection request\n", + qp->connector->cm_state); + hvnd_refuse_connection(qp->connector, -ECONNREFUSED); + break; + + case hvnd_cm_connect_reply_sent: + case hvnd_cm_established_sent: + case hvnd_cm_accept_sent: + hvnd_warn("cm_state = %d notifying disconnect on existing connection\n", + qp->connector->cm_state); + hvnd_process_notify_disconnect(qp->connector, + STATUS_CANCELLED); + break; + + default: + hvnd_error("unknown cm_state = %d\n", + qp->connector->cm_state); + + } + goto free_connector; + } else { + hvnd_debug("qp_state=%d, doing normal disconnect\n", + qp->qp_state); + } + + if (!ep_add_work_pending(qp->connector)) + goto free_connector; + + init_completion(&qp->connector->disconnect_event); + + /* + * First issue a disconnect on the connector. + */ + + hvnd_debug("calling hvnd_connector_disconnect\n"); + ret = hvnd_connector_disconnect(nd_dev, qp->uctx, + qp->connector->ep_handle, + qp->connector); + if (ret) { + ep_del_work_pending(qp->connector); + hvnd_error("disconnect: retval is %d\n", ret); + ep_stop(qp->connector); + goto free_connector; + } + /* + * Now wait for the disconnect. + */ + jiffies = get_jiffies_64(); + wait_for_completion(&qp->connector->disconnect_event); + hvnd_debug("Completed disconnect connector=%p jiffies=%llu\n", + qp->connector, get_jiffies_64() - jiffies); + + /* + * Now free up the connector and drop the reference on uctx. 
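+ * The host-side connector handle is released first (IOCTL_ND_CONNECTOR_FREE); + * only then are the uctx reference and the local ep object dropped.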
+ */ + + ep_stop(qp->connector); + +free_connector: + hvnd_debug("destroying connector handle: %p\n", + (void *) qp->connector->ep_handle); + hvnd_free_handle(nd_dev, qp->uctx, + qp->connector->ep_handle, + IOCTL_ND_CONNECTOR_FREE); + + hvnd_drop_uctx_ref(nd_dev, qp->uctx); + hvnd_destroy_ep(qp->connector); + qp->connector = NULL; +free_qp: + atomic_dec(&qp->refcnt); + hvnd_debug("Waiting for the ref cnt to go to 0\n"); + + wait_event(qp->wait, !atomic_read(&qp->refcnt)); + + hvnd_debug("About to destroy qp\n"); + hvnd_db_unmap_user(qp->uctx, (u64)qp->db_addr); + ib_umem_release(qp->umem); + + hvnd_debug("About to free qp\n"); + ret = hvnd_free_qp(nd_dev, qp->uctx, qp); + + if (ret == 0) { + hvnd_drop_uctx_ref(nd_dev, qp->uctx); + kfree(qp); + } else { + hvnd_error("free qp failed: ret is %d\n", ret); + } + + return ret; +} + +static struct ib_cq *hvnd_ib_create_cq(struct ib_device *ibdev, + const struct ib_cq_init_attr *attr, + struct ib_ucontext *ib_context, + struct ib_udata *udata) +{ + struct hvnd_ucontext *uctx; + struct hvnd_dev *nd_dev; + struct mlx4_ib_create_cq ucmd; + struct hvnd_cq *cq; + int ret = 0; + int entries = attr->cqe; + + uctx = to_nd_context(ib_context); + nd_dev = to_nd_dev(ibdev); + + if (entries < 1 || entries > uctx->max_cqe) { + hvnd_error("incorrect entries=%d\n", entries); + ret = -EINVAL; + goto err_done; + } + + cq = kzalloc(sizeof(*cq), GFP_KERNEL); + if (!cq) { + ret = -ENOMEM; + goto err_done; + } + + entries = roundup_pow_of_two(entries + 1); + cq->ibcq.cqe = entries - 1; + cq->entries = entries; + cq->uctx = uctx; + + if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { + hvnd_error("ib_copy_from_udata failed\n"); + ret = -EFAULT; + goto err_ucpy; + } + + cq->cq_buf = (void *)ucmd.buf_addr; + cq->db_addr = (void *)ucmd.db_addr; + cq->arm_sn = 0; + + /* + * Initialize the IRP state. Need to have a separate irp state + * for CQ; for now share it with Listener/connector.
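+ * The embedded ep_object carries the IRP handle the host uses to route + * CQ-notification completions back to this CQ; hvnd_notify_cq() below + * passes its address as the IRP context.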
+ */ + ret = hvnd_init_ep(&cq->ep_object, NULL, ND_CQ, nd_dev, uctx); + + if (ret) { + hvnd_error("hvnd_init_ep failed ret=%d\n", ret); + goto err_ucpy; + } + + cq->ep_object.cq = cq; + cq->monitor = true; + + cq->umem = ib_umem_get(ib_context, ucmd.buf_addr, + (entries * uctx->cqe_size), + IB_ACCESS_LOCAL_WRITE, 1); + if (IS_ERR(cq->umem)) { + ret = PTR_ERR(cq->umem); + hvnd_error("ib_umem_get failed ret=%d\n", ret); + goto err_ucpy; + } + + ret = hvnd_db_map_user(uctx, ucmd.db_addr, &cq->db_umem); + + if (ret) { + hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); + goto err_db_map; + } + + ret = hvnd_create_cq(nd_dev, uctx, cq); + + if (ret) { + hvnd_error("hvnd_create_cq failed ret=%d\n", ret); + goto err_cq; + } + + cq->ep_object.ep_handle = cq->cq_handle; + + if (ib_copy_to_udata(udata, &cq->cqn, sizeof(__u32))) { + hvnd_error("ib_copy_to_udata failed\n"); + ret = -EFAULT; + goto err_ucpy_out; + } + + if (!disable_cq_notify) { + + if (!ep_add_work_pending(&cq->ep_object)) + goto err_ucpy_out; + + ret = hvnd_notify_cq(nd_dev, cq, ND_CQ_NOTIFY_ANY, + (u64)&cq->ep_object); + + if (ret) { + ep_del_work_pending(&cq->ep_object); + hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); + goto err_ucpy_out; + } + } + + hvnd_acquire_uctx_ref(uctx); + + return &cq->ibcq; + +err_ucpy_out: + hvnd_destroy_cq(nd_dev, cq); + +err_cq: + hvnd_db_unmap_user(uctx, ucmd.db_addr); + +err_db_map: + ib_umem_release(cq->umem); + +err_ucpy: + kfree(cq); +err_done: + return ERR_PTR(ret); +} + +static struct ib_qp *hvnd_get_qp(struct ib_device *dev, int qpn) +{ + struct hvnd_dev *nd_dev; + struct hvnd_qp *qp = NULL; + + nd_dev = to_nd_dev(dev); + qp = get_qpp(nd_dev, qpn); + return qp ? &qp->ibqp : NULL; +} + +static int hvnd_ib_destroy_cq(struct ib_cq *ib_cq) +{ + struct hvnd_ucontext *uctx; + struct hvnd_dev *nd_dev; + struct hvnd_cq *cq; + + cq = to_nd_cq(ib_cq); + uctx = cq->uctx; + nd_dev = to_nd_dev(uctx->ibucontext.device); + + cq->monitor = false; + +// hvnd_cancel_io(&cq->ep_object); + ep_stop(&cq->ep_object); + + hvnd_deinit_ep(&cq->ep_object); + + hvnd_db_unmap_user(uctx, (u64)cq->db_addr); + ib_umem_release(cq->umem); + + hvnd_destroy_cq(nd_dev, cq); + + hvnd_drop_uctx_ref(nd_dev, uctx); + kfree(cq); + + return 0; +} + +static int hvnd_resize_cq(struct ib_cq *cq, int cqe, struct ib_udata *udata) +{ + /* + * NDDirect does not support resizing CQ.
+ */ + hvnd_info("check code\n"); + return -ENOSYS; +} + +static int hvnd_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc) +{ + hvnd_info("check code\n"); + return 0; +} + +static struct ib_mr *hvnd_get_dma_mr(struct ib_pd *pd, int acc) +{ + hvnd_info("check code\n"); + return NULL; +} + +static void debug_dump_umem(struct ib_umem *umem) +{ +#ifdef HVND_MEM_DEBUG + struct ib_umem_chunk *chunk; + struct scatterlist *sg; + int len, j, entry; + int shift = ffs(umem->page_size) - 1; + + hvnd_debug("umem=%p\n", umem); + hvnd_debug("context=%p length=%lu offset=%d page_size=%d writable=%d hugetlb=%d\n", + umem->context, + umem->length, + umem->offset, + umem->page_size, + umem->writable, + umem->hugetlb); + + list_for_each_entry(chunk, &umem->chunk_list, list) { + hvnd_debug("chunk->nmap=%d\n", chunk->nmap); + for (j = 0; j < chunk->nmap; ++j) { + sg = &chunk->page_list[j]; + hvnd_debug("sg_dma_len=%d sg_dma_address=%llx\n", + sg_dma_len(sg), sg_dma_address(sg)); + hvnd_debug("page_link=%lx offset=%u length=%u\n", + sg->page_link, sg->offset, sg->length); + len = sg_dma_len(&chunk->page_list[j]) >> shift; + for_each_sg(&chunk->page_list[j], sg, len, entry) { + hvnd_debug("PFN=%lu\n", + page_to_pfn(sg_page(sg))); + } + } + } +#endif +} + + +static struct ib_mr *hvnd_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, + u64 virt, int acc, struct ib_udata *udata) +{ + int err = 0; + struct hvnd_ib_pd *hvndpd = to_nd_pd(pd); + struct hvnd_mr *mr; + + mr = kmalloc(sizeof(*mr), GFP_KERNEL); + if (!mr) + return ERR_PTR(-ENOMEM); + mr->pd = hvndpd; + + mr->umem = ib_umem_get(pd->uobject->context, start, length, acc, 0); + if (IS_ERR(mr->umem)) { + err = PTR_ERR(mr->umem); + hvnd_error("ib_umem_get failed ret=%d\n", err); + kfree(mr); + return ERR_PTR(err); + } + + debug_dump_umem(mr->umem); + + mr->start = start; + mr->length = length; + mr->virt = virt; + mr->acc = acc; + + hvnd_debug("start=%llx length=%llx virt=%llx acc=%d\n", + start, length, virt, acc); + + /* + * First create a memory region. + */ + err = hvnd_cr_mr(to_nd_dev(pd->device), + to_nd_context(pd->uobject->context), hvndpd->handle, + &mr->mr_handle); + if (err) { + hvnd_error("cr_mr failed; ret is %d\n", err); + goto err; + } + + err = hvnd_mr_register(to_nd_dev(pd->device), + to_nd_context(pd->uobject->context), mr); + + if (err) + goto err0; + + hvnd_acquire_uctx_ref(to_nd_context(pd->uobject->context)); + + return &mr->ibmr; + +err0: + hvnd_free_mr(to_nd_dev(pd->device), + to_nd_context(pd->uobject->context), mr->mr_handle); +err: + ib_umem_release(mr->umem); + kfree(mr); + return ERR_PTR(err); +} + + + +static int hvnd_dereg_mr(struct ib_mr *ib_mr) +{ + int ret; + struct hvnd_mr *mr = to_nd_mr(ib_mr); + struct hvnd_ucontext *uctx = to_nd_context(ib_mr->pd->uobject->context); + struct hvnd_dev *nd_dev = to_nd_dev(ib_mr->device); + + + hvnd_debug("dereg_mr entering\n"); + + ret = hvnd_deregister_mr(nd_dev, uctx, mr->mr_handle); + + if (ret) { + hvnd_error("hvnd_deregister_mr() failed: %x\n", ret); + return ret; + } + /* + * Now free up the memory region. 
+ */ + + ret = hvnd_free_mr(nd_dev, uctx, mr->mr_handle); + if (ret) { + hvnd_error("hvnd_free_mr() failed: %x\n", ret); + return ret; + } + + ib_umem_release(mr->umem); + + hvnd_drop_uctx_ref(nd_dev, uctx); + kfree(mr); + + hvnd_debug("dereg_mr done\n"); + return 0; +} + +static struct ib_mw *hvnd_alloc_mw(struct ib_pd *pd, enum ib_mw_type type, + struct ib_udata *udata) +{ + hvnd_info("check code\n"); + return NULL; +} + +static int hvnd_dealloc_mw(struct ib_mw *mw) +{ + debug_check(__func__, __LINE__); + return 0; +} + +static int hvnd_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags) +{ + struct hvnd_ucontext *uctx; + struct hvnd_dev *nd_dev; + struct hvnd_cq *cq; + + cq = to_nd_cq(ibcq); + uctx = cq->uctx; + nd_dev = to_nd_dev(uctx->ibucontext.device); + + + debug_check(__func__, __LINE__); + + return 0; +} + +static int hvnd_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, + struct ib_send_wr **bad_wr) +{ + debug_check(__func__, __LINE__); + return 0; +} + +int hvnd_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr, + struct ib_recv_wr **bad_wr) +{ + debug_check(__func__, __LINE__); + return 0; +} + +static int hvnd_resolve_addr(struct sockaddr_in *laddr, + struct sockaddr_in *raddr, + struct if_physical_addr *phys_addrstruct) +{ + int ret; + + phys_addrstruct->length = ETH_ALEN; + ret = hvnd_get_neigh_mac_addr((struct sockaddr *)laddr, + (struct sockaddr *)raddr, + phys_addrstruct->addr); + + hvnd_debug("Dest MAC is %pM\n", phys_addrstruct->addr); + return ret; +} + +static int hvnd_connect(struct iw_cm_id *cm_id, + struct iw_cm_conn_param *conn_param) +{ + int ret = 0; + struct hvnd_dev *nd_dev; + struct hvnd_ep_obj *ep_object; + struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_id->remote_addr; + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; + struct hvnd_qp *qp; + struct if_physical_addr phys_addrstruct; + union nd_sockaddr_inet dest_addr; + u64 connector_handle; + union nd_sockaddr_inet addr; + char addr_buf[50]; + + if (cm_id->remote_addr.ss_family != AF_INET) { + hvnd_error("cm_id->remote_addr.ss_family=%d not AF_INET\n", + cm_id->remote_addr.ss_family); + return -ENOSYS; + } + + qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); + + if (!qp) { + hvnd_error("failed to find qp conn_param->qpn=%d\n", + conn_param->qpn); + return -EINVAL; + } + + cm_id->provider_data = qp; + cm_id->add_ref(cm_id); + qp->cm_id = cm_id; + + /* + * Set the read/write limits. + * Can we change the limits on a created QP? Luke? + */ + nd_dev = to_nd_dev(cm_id->device); + ep_object = hvnd_setup_ep(cm_id, ND_CONNECTOR, nd_dev, qp->uctx); + if (!ep_object) { + hvnd_error("hvnd_setup_ep failure\n"); + ret = -ENOMEM; + goto err_limit; + } + hvnd_debug("active connection: local irp is %d\n", + ep_object->local_irp); + + ret = hvnd_cr_connector(nd_dev, qp->uctx, + &connector_handle); + + if (ret) { + hvnd_error("hvnd_cr_connector failure ret=%d\n", ret); + goto err_cr_connector; + } + + hvnd_acquire_uctx_ref(qp->uctx); + ep_object->ep_handle = connector_handle; + ep_object->incoming = false; + qp->connector = ep_object; + + /* + * Bind the local address to the connector.
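+ * The connector must be bound to a local address before the connect can + * be issued; after binding, the peer MAC is resolved and handed to + * hvnd_connector_connect() together with the QP handle.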
+ */ + hvnd_debug("Connect local address is %s\n", + debug_inet_ntoa(laddr->sin_addr, addr_buf)); + + memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); + hvnd_debug("CONNECT AF %d port %d addr %s\n", + addr.ipv4.sin_family, addr.ipv4.sin_port, + debug_inet_ntoa(addr.ipv4.sin_addr, addr_buf)); + + ret = hvnd_bind_connector(nd_dev, qp->uctx, + connector_handle, + &addr); + + if (ret) { + hvnd_error("hvnd_bind_connector failed ret=%d\n", ret); + goto err_bind_connector; + } + + ret = hvnd_resolve_addr(laddr, raddr, &phys_addrstruct); + if (ret) { + hvnd_error("hvnd_resolve_addr failed ret=%d\n", ret); + goto err_bind_connector; + } + + memcpy(&dest_addr.ipv4, raddr, sizeof(struct sockaddr_in)); + + + + /* + * Now attempt to connect. + */ + + hvnd_debug("About to initiate connection\n"); + + if (!ep_add_work_pending(ep_object)) + goto err_bind_connector; + + ep_object->cm_state = hvnd_cm_connect_received; + ret = hvnd_connector_connect(nd_dev, qp->uctx, + ep_object->ep_handle, + conn_param->ird, conn_param->ord, + conn_param->private_data_len, + (u8 *)conn_param->private_data, + qp->qp_handle, + &phys_addrstruct, &dest_addr, + ep_object); + + if (ret == 0) + return 0; + + ep_object->cm_state = hvnd_cm_idle; + ep_del_work_pending(ep_object); + hvnd_error("hvnd_connector_connect failed ret=%d\n", ret); + +err_bind_connector: + qp->connector = NULL; + hvnd_free_connector(nd_dev, qp->uctx, + connector_handle); + hvnd_drop_uctx_ref(nd_dev, qp->uctx); + +err_cr_connector: + kfree(ep_object); + +err_limit: + cm_id->provider_data = NULL; + qp->cm_id = NULL; + cm_id->rem_ref(cm_id); + return ret; +} + +static int hvnd_accept_cr(struct iw_cm_id *cm_id, + struct iw_cm_conn_param *conn_param) +{ + int ret = 0; + struct hvnd_dev *nd_dev; + struct hvnd_qp *qp; + struct hvnd_ep_obj *connector; + enum ibv_qp_state new_qp_state; + + hvnd_debug("Accepting connection - PASSIVE\n"); + nd_dev = to_nd_dev(cm_id->device); + qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); + + if (!qp) { + hvnd_error("get_qpp failed conn_param->qpn=%d\n", + conn_param->qpn); + return -EINVAL; + } + + + connector = (struct hvnd_ep_obj *)cm_id->provider_data; + qp->connector = connector; + + if (connector == NULL) { + hvnd_error("NULL connector!\n"); + return -EINVAL; + } + hvnd_debug("connector's cm_id is %p caller cm_id=%p\n", + connector->cm_id, cm_id); + + connector->cq = qp->recv_cq; + + + /* + * Setup state for the accepted connection. 
+ */ + cm_id->add_ref(cm_id); + connector->cm_id = cm_id; + if (conn_param == NULL) { + hvnd_error("NULL conn_param!\n"); + return -EINVAL; + } + + connector->ord = conn_param->ord; + connector->ird = conn_param->ird; + + if (!ep_add_work_pending(connector)) + goto error; + + init_completion(&connector->connector_accept_event); + + ret = hvnd_connector_accept(nd_dev, qp->uctx, connector->ep_handle, + qp->qp_handle, conn_param->ird, + conn_param->ord, + conn_param->private_data_len, + conn_param->private_data, + &new_qp_state, connector); + + if (ret) { + ep_del_work_pending(connector); + hvnd_error("connector accept failed\n"); + goto error; + } + + wait_for_completion(&connector->connector_accept_event); + ret = connector->connector_accept_status; + + if (ret) { + hvnd_error("connector_accept failed status=%x\n", ret); + ret = -EIO; + goto error; + } + + hvnd_debug("Passive Connection Accepted; new qp state is %d\n", + new_qp_state); + connector->cm_state = hvnd_cm_accept_sent; + return 0; + +error: + ep_stop(connector); + connector->cm_id = NULL; + connector->cm_state = hvnd_cm_idle; + + qp->connector = NULL; + cm_id->rem_ref(cm_id); + + return ret; +} + +static int hvnd_reject_cr(struct iw_cm_id *cm_id, const void *pdata, + u8 pdata_len) +{ + debug_check(__func__, __LINE__); + return 0; +} + +void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status) +{ + struct iw_cm_event cm_event; + + switch (status) { + case STATUS_SUCCESS: + case STATUS_CANCELLED: + break; + + default: + hvnd_warn("disconnect complete failed: status:%d\n", status); + } + + + hvnd_debug("active disconnect processed\n"); + memset(&cm_event, 0, sizeof(cm_event)); + + complete(&ep_object->disconnect_event); +} + + +void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status) +{ + struct iw_cm_event cm_event; + + // make sure we only disconnect once + if (atomic_xchg(&ep_object->disconnect_notified, 1)) + return; + + /* + * Turn off CQ monitoring. + */ + if (ep_object->cq) + ep_object->cq->monitor = false; + + switch (ep_object->cm_state) { + case hvnd_cm_connect_reply_sent: + case hvnd_cm_established_sent: + case hvnd_cm_accept_sent: + break; + + default: + hvnd_error("unexpected cm_state=%d\n", ep_object->cm_state); + return; + } + + switch (status) { + case STATUS_SUCCESS: + case STATUS_CANCELLED: + case STATUS_DISCONNECTED: + break; + + default: + hvnd_warn("notify disconnect complete failed: status:%d\n", + status); + } + + hvnd_debug("passive disconnect notified\n"); + memset(&cm_event, 0, sizeof(cm_event)); + + /* + * Other end disconnected. + * Connection has been disconnected; + * notify the cm layer. + */ + cm_event.status = -ECONNRESET; + cm_event.event = IW_CM_EVENT_CLOSE; + + if ((ep_object->cm_id) && + (ep_object->cm_id->event_handler)) { + + ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); + + ep_object->cm_id->rem_ref(ep_object->cm_id); + ep_object->cm_state = hvnd_cm_close_sent; + } +} + +void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status) +{ + struct iw_cm_event cm_event; + int ret; + + /* + * This is the problem area the return status may be: + * 1: 0xc00000b5 (3221225653) - {Device Timeout} The specified I/O + * operation on %hs was not completed before the time-out period + * expired + * 2: NTSTATUS 0xc0000241 (3221226049) - The transport connection was + * aborted by the local system. 
+ * If we do nothing here, iwcm will wait for IW_CM_EVENT_ESTABLISHED + * forever and we will be unable to shut down cleanly; we need to fail + * the call earlier, on accept. + */ + + ep_object->connector_accept_status = status; + + if (status) { + hvnd_error("Connector accept failed; status is %x\n", status); + complete(&ep_object->connector_accept_event); + return; + } + + memset(&cm_event, 0, sizeof(cm_event)); + cm_event.event = IW_CM_EVENT_ESTABLISHED; + cm_event.ird = ep_object->ird; + cm_event.ord = ep_object->ord; + cm_event.provider_data = (void *)ep_object; + + /* + * We have successfully passively accepted the + * incoming connection. + */ + + hvnd_debug("Passive connection accepted!!\n"); + if ((ep_object->cm_id) && + (ep_object->cm_id->event_handler)) { + ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); + ep_object->cm_state = hvnd_cm_established_sent; + } + + complete(&ep_object->connector_accept_event); + + /* + * Request notification if the other end + * were to disconnect. + */ + if (!ep_add_work_pending(ep_object)) + return; + + ret = hvnd_connector_notify_disconnect(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + ep_object); + + if (ret) { + ep_del_work_pending(ep_object); + hvnd_error("Connector notify disconnect failed; ret: %d\n", + ret); + } +} + + +void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep_object, + int status) +{ + + struct ib_cq *ibcq; + struct hvnd_cq *cq; + + cq = ep_object->cq; + ibcq = &ep_object->cq->ibcq; + + if (!cq->monitor) + return; + + // call the previous CQ complete + if ((status == STATUS_PENDING) && cq->upcall_pending && + ibcq->comp_handler) { + ibcq->comp_handler(ibcq, ibcq->cq_context); + cq->upcall_pending = false; + hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); + } + + if ((status != STATUS_PENDING) && ibcq->comp_handler && + ibcq->cq_context) { + ibcq->comp_handler(ibcq, ibcq->cq_context); + hvnd_error("CQ comp_handler called status=%x\n", status); + } +} + +void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep_object, + int status) +{ + struct ib_cq *ibcq; + struct hvnd_cq *cq; + int ret; + + cq = ep_object->cq; + ibcq = &ep_object->cq->ibcq; + + // call the previous CQ complete + if (cq->upcall_pending && ibcq->comp_handler) { + ibcq->comp_handler(ibcq, ibcq->cq_context); + cq->upcall_pending = false; + hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); + } + + cq->upcall_pending = true; + if (!ep_add_work_pending(ep_object)) + return; + + ret = hvnd_notify_cq(ep_object->nd_dev, + ep_object->cq, + ND_CQ_NOTIFY_ANY, + (u64)ep_object); + + if (ret) { + ep_del_work_pending(ep_object); +// hvnd_manage_io_state(ep_object, true); + hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); + } + + if ((status != 0) && (status != STATUS_CANCELLED)) { + if (ibcq->event_handler) { + struct ib_event event; + + event.device = ibcq->device; + event.event = IB_EVENT_CQ_ERR; + event.element.cq = ibcq; + ibcq->event_handler(&event, ibcq->cq_context); + + hvnd_warn("CQ event_handler called status=%x\n", + status); + } + } +} + +int init_cm_event(struct hvnd_ep_obj *ep_object, struct iw_cm_event *cm_event, + int event) +{ + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event->local_addr; + struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event->remote_addr; + struct nd_read_limits rd_limits; + union nd_sockaddr_inet local_addr; + union nd_sockaddr_inet remote_addr; + int ret; + + /* + * Now get the local address.
+ */ + ret = hvnd_connector_get_local_addr(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &local_addr); + + if (ret) { + hvnd_error("Connector get addr failed; ret: %d\n", ret); + return ret; + } + /* + * Now get the remote address. + */ + ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &remote_addr); + + if (ret) { + hvnd_error("Connector get peer addr failed; ret: %d\n", ret); + return ret; + } + + /* + * Get other connection parameters. + */ + + ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &rd_limits); + + if (ret) { + hvnd_error("Connector rd limits failed; ret: %d\n", ret); + return ret; + } + + /* + * XXXKYS: Luke: What about the length of the priv data? + */ + ret = hvnd_connector_get_priv_data(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + ep_object->priv_data); + + if (ret) { + hvnd_error("Connector get priv data failed; ret: %d\n", ret); + return ret; + } + /* + * Initialize CM structure. + */ + laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; + hvnd_debug("Local addr is %d\n", laddr->sin_addr.s_addr); + laddr->sin_port = local_addr.ipv4.sin_port; + laddr->sin_family = AF_INET; + + raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; + hvnd_debug("Remote addr is %d\n", raddr->sin_addr.s_addr); + raddr->sin_port = remote_addr.ipv4.sin_port; + raddr->sin_family = AF_INET; + + // KYS; LUke: is it always 148 bytes? + cm_event->private_data_len = MAX_PRIVATE_DATA_LEN; + cm_event->private_data = ep_object->priv_data; + + cm_event->ird = rd_limits.inbound; + cm_event->ord = rd_limits.outbound; + cm_event->event = event; + + ep_object->ird = cm_event->ird; + ep_object->ord = cm_event->ord; + + return 0; +} + +static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status) +{ + struct iw_cm_event cm_event; + + memset(&cm_event, 0, sizeof(cm_event)); + + cm_event.event = IW_CM_EVENT_CONNECT_REPLY; + cm_event.status = status; + + hvnd_debug("returning status %d on connector %p\n", status, connector); + + if (connector->cm_id && connector->cm_id->event_handler) { + connector->cm_id->event_handler(connector->cm_id, &cm_event); + connector->cm_id->rem_ref(connector->cm_id); + connector->cm_state = hvnd_cm_connect_reply_refused; + } +} + +void hvnd_process_events(struct work_struct *work) +{ + struct hvnd_work *wrk; + struct nd_read_limits rd_limits; + struct hvnd_ep_obj *ep_object; + struct hvnd_ep_obj *parent; + struct iw_cm_event cm_event; + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event.local_addr; + struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event.remote_addr; + struct ndv_packet_hdr_control_1 *ctrl_hdr; + union nd_sockaddr_inet local_addr; + union nd_sockaddr_inet remote_addr; + struct pkt_nd_get_connection_listener *connection_pkt; + struct iw_cm_id *cm_id = NULL; + int status; + int ioctl; + int ret; + char priv_data[MAX_PRIVATE_DATA_LEN]; + enum ibv_qp_state new_qp_state; + struct incoming_pkt *incoming_pkt; + unsigned long flags; + + memset(&cm_event, 0, sizeof(cm_event)); + memset(&priv_data, 0, MAX_PRIVATE_DATA_LEN); + + + wrk = container_of(work, struct hvnd_work, work); + + /* + * Now call into the connection manager. 
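+ * Packets queued by the channel callback are drained from + * incoming_pkt_list one at a time under its spinlock; the work function + * loops via the process_next label until the list is empty.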
+ */ + ep_object = (struct hvnd_ep_obj *)wrk->callback_arg; + parent = ep_object->parent; + +process_next: + incoming_pkt = NULL; + spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); + if (!list_empty(&ep_object->incoming_pkt_list)) { + incoming_pkt = list_first_entry(&ep_object->incoming_pkt_list, + struct incoming_pkt, + list_entry); + list_del(&incoming_pkt->list_entry); + } + spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, flags); + if (incoming_pkt == NULL) + return; + + ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt->pkt; + status = ctrl_hdr->io_status; + ioctl = ctrl_hdr->io_cntrl_code; + + hvnd_debug("Process Events IOCTL is: %s; iostatus failure: %x in work queue\n", + hvnd_get_op_name(ioctl), status); + + if (status != 0) { + bool log_error = true; + + if (ioctl == IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT && + status == STATUS_DISCONNECTED) // expected + log_error = false; + + if (log_error) + hvnd_warn("Process Events IOCTL is: %s; iostatus failure: %x\n", + hvnd_get_op_name(ioctl), status); + } + + cm_event.status = status; + + switch (ep_object->type) { + case ND_CONNECTOR: + switch (ioctl) { + case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: + + if (ep_object->parent != NULL) { + + // Do nothing with this connection request if + // listener is stopping + if (!ep_add_work_pending(ep_object->parent)) + break; + + cm_id = ep_object->parent->cm_id; //Listener + } + + connection_pkt = (struct pkt_nd_get_connection_listener *) ctrl_hdr; + + if ((status == 0) || (status == STATUS_CANCELLED)) { + hvnd_get_incoming_connections(ep_object->parent, + ep_object->parent->nd_dev, + ep_object->uctx); + } + + if (status) + goto get_connection_request_done; + + /* + * Now get the local address. + */ + ret = hvnd_connector_get_local_addr(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &local_addr); + + if (ret) { + hvnd_error("Connector get addr failed; ret: %d\n", + ret); + goto get_connection_request_done; + } + /* + * Now get the remote address. + */ + ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &remote_addr); + + if (ret) { + hvnd_error("Connector get peer addr failed; ret: %d\n", + ret); + goto get_connection_request_done; + } + /* + * Get other connection parameters. + */ + + ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &rd_limits); + + if (ret) { + hvnd_error("Connector rd limits failed; ret: %d\n", + ret); + goto get_connection_request_done; + } + + /* + * XXXKYS: Luke: What about the length of the priv data? + */ + ret = hvnd_connector_get_priv_data( + ep_object->nd_dev, ep_object->uctx, + ep_object->ep_handle, ep_object->priv_data); + + if (ret) { + hvnd_error("Connector get priv data failed; ret: %d\n", + ret); + goto get_connection_request_done; + } + + cm_event.event = IW_CM_EVENT_CONNECT_REQUEST; + cm_event.provider_data = (void *)ep_object; + + laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; + hvnd_debug("Local addr is %d\n", + laddr->sin_addr.s_addr); + laddr->sin_port = local_addr.ipv4.sin_port; + laddr->sin_family = AF_INET; + + raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; + hvnd_debug("Remote addr is %d\n", + raddr->sin_addr.s_addr); + raddr->sin_port = remote_addr.ipv4.sin_port; + raddr->sin_family = AF_INET; + + // KYS; LUke: is it always 148 bytes?
+ cm_event.private_data_len = MAX_PRIVATE_DATA_LEN; + cm_event.private_data = ep_object->priv_data; + + cm_event.ird = rd_limits.inbound; + cm_event.ord = rd_limits.outbound; + + ep_object->ird = cm_event.ird; + ep_object->ord = cm_event.ord; + + + if ((cm_id != NULL) && cm_id->event_handler) { + cm_id->event_handler(cm_id, &cm_event); + ep_object->cm_state = hvnd_cm_connect_request_sent; + } + +get_connection_request_done: + if (ep_object->parent != NULL) + ep_del_work_pending(ep_object->parent); + break; + + case IOCTL_ND_CONNECTOR_CONNECT: + + cm_event.event = IW_CM_EVENT_CONNECT_REPLY; + if (status == STATUS_TIMEOUT && + ep_object->connector_connect_retry < 3) { + //TIMEOUT retry + + if (!ep_add_work_pending(ep_object)) + goto refuse_connection; + + hvnd_warn("Connector connect timed out, reconnecting... retry count: %d\n", + ep_object->connector_connect_retry); + ep_object->connector_connect_retry++; + ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, + &ep_object->connector_connect_pkt.hdr, + sizeof(ep_object->connector_connect_pkt), + (u64)&ep_object->connector_connect_pkt); + + if (ret) { + hvnd_error("Connector retry on timeout failed: %d\n", + ret); + ep_del_work_pending(ep_object); + goto refuse_connection; + } + break; + } + +refuse_connection: + if (status) { + cm_event.status = -ECONNREFUSED; + if (status == STATUS_TIMEOUT) + cm_event.status = -ETIMEDOUT; + + hvnd_refuse_connection(ep_object, + cm_event.status); + break; + } + + hvnd_debug("ACTIVE Connection ACCEPTED\n"); + ret = init_cm_event(ep_object, &cm_event, + IW_CM_EVENT_CONNECT_REPLY); + if (ret) { + hvnd_error("init_cm_event failed ret=%d\n", + ret); + goto process_done; + } + + ret = hvnd_connector_complete_connect(ep_object->nd_dev, + ep_object->uctx, + ep_object->ep_handle, + &new_qp_state); + if (ret) { + hvnd_error("connector_complete failed\n"); + goto process_done; + } + + cm_event.provider_data = (void *)ep_object; + + if ((ep_object->cm_id) && + (ep_object->cm_id->event_handler)) { + ep_object->cm_id->event_handler( + ep_object->cm_id, &cm_event); + ep_object->cm_state = hvnd_cm_connect_reply_sent; + } + /* + * Request notification if the other end + * were to disconnect.
+ */ + if (!ep_add_work_pending(ep_object)) + goto process_done; + + ret = hvnd_connector_notify_disconnect( + ep_object->nd_dev, ep_object->uctx, + ep_object->ep_handle, ep_object); + + if (ret) { + ep_del_work_pending(ep_object); + hvnd_error("Connector notify disconnect failed; ret: %d\n", + ret); + } + + break; + + case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: + hvnd_process_notify_disconnect(ep_object, status); + break; + + + default: + hvnd_error("Unknown Connector IOCTL\n"); + break; + } + break; + default: + hvnd_error("Unknown endpoint object\n"); + break; + } +process_done: + kfree(incoming_pkt); + ep_del_work_pending(ep_object); + + goto process_next; +} + + +static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, + struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx) +{ + struct hvnd_ep_obj *ep_object; + int ret; + + ep_object = kzalloc(sizeof(struct hvnd_ep_obj), GFP_KERNEL); + + if (!ep_object) + return NULL; + + ret = hvnd_init_ep(ep_object, cm_id, ep_type, nd_dev, uctx); + + if (ret) { + hvnd_error("hvnd_init_ep failed ret=%d\n", ret); + kfree(ep_object); + return NULL; + } + + return ep_object; +} + +/* + * return value: + * true: I/O state is stopped, we should not do upcall + * false: I/O state is running and normal + */ +#if 0 +static bool hvnd_manage_io_state(struct hvnd_ep_obj *ep, bool failure) +{ + unsigned long flags; + + spin_lock_irqsave(&ep->ep_lk, flags); + if (ep->to_be_destroyed) { + hvnd_warn("ep being destroyed\n"); + if (ep->io_outstanding) { + hvnd_warn("ep being destroyed i/O pending waking up on %p\n", + &ep->block_event); + complete(&ep->block_event); + ep->io_outstanding = false; + } + spin_unlock_irqrestore(&ep->ep_lk, flags); + return true; + } + if (!failure) + ep->io_outstanding = true; + spin_unlock_irqrestore(&ep->ep_lk, flags); + return false; +} +#endif + +static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, + struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx) +{ + struct hvnd_ep_obj *connector; + u64 connector_handle; + int ret; + + /* + * First handle the protocol for + * destruction - outstanding I/O. + */ + +// if (hvnd_manage_io_state(listener, false)) +// return 0; + /* + * Create a connector. + */ + connector = hvnd_setup_ep(listener->cm_id, ND_CONNECTOR, nd_dev, uctx); + if (!connector) { + hvnd_error("hvnd_setup_ep failed\n"); + ret = -ENOMEM; + goto con_alloc_err; + } + + ret = hvnd_cr_connector(nd_dev, uctx, + &connector_handle); + if (ret) { + hvnd_error("hvnd_cr_connector failed ret=%d\n", ret); + goto con_cr_err; + } + + /* + * Now get a connection if one is pending.
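+ * The listener keeps a single connector pre-posted with the host; when + * the request completes, hvnd_process_events() posts a replacement before + * handing the new connection to the cm layer.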
+ */ + connector->ep_handle = connector_handle; + connector->parent = listener; + + if (!ep_add_work_pending(connector)) + goto get_connection_err; + + ret = hvnd_get_connection_listener(nd_dev, uctx, + listener->ep_handle, + connector_handle, + (u64)connector); + + if (ret) { + hvnd_debug("listener_get_connection failed\n"); + ep_del_work_pending(connector); + goto get_connection_err; + } + + hvnd_acquire_uctx_ref(uctx); + listener->outstanding_handle = connector_handle; + listener->outstanding_ep = connector; + hvnd_debug("outstanding handle is %p\n", (void *)connector_handle); + return 0; + +get_connection_err: + hvnd_free_handle(nd_dev, uctx, + connector_handle, + IOCTL_ND_CONNECTOR_FREE); + +con_cr_err: + kfree(connector); +con_alloc_err: +// hvnd_manage_io_state(listener, true); + return ret; +} + +static int hvnd_create_listen(struct iw_cm_id *cm_id, int backlog) +{ + int ret = 0; + struct hvnd_dev *nd_dev; + struct hvnd_ucontext *uctx; + struct hvnd_ep_obj *ep_object; + union nd_sockaddr_inet addr; + union nd_sockaddr_inet local_addr; + u64 listener_handle; + struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; + union nd_sockaddr_inet og_addr; + + + nd_dev = to_nd_dev(cm_id->device); + uctx = get_uctx(nd_dev, current_pid()); + hvnd_debug("uctx is %p; pid is %d\n", uctx, current_pid()); + + if (cm_id->local_addr.ss_family != AF_INET) { + hvnd_error("cm_id->local_addr.ss_family=%d not AF_INET\n", + cm_id->local_addr.ss_family); + return -ENOSYS; + } + + /* + * If the local address is LOOPBACK or INADDR_ANY, get an address + * to bind the listener. For now, just get the first address + * available. + */ + + if (IN_LOOPBACK(ntohl(laddr->sin_addr.s_addr)) || + (laddr->sin_addr.s_addr == INADDR_ANY)) { + + hvnd_debug("need to get an address\n"); + ret = hvnd_get_outgoing_rdma_addr(nd_dev, uctx, &og_addr); + + if (ret) { + hvnd_error("failed to get the og address\n"); + return ret; + } + + laddr->sin_addr.s_addr = og_addr.ipv4.sin_addr.s_addr; + } + + cm_id->add_ref(cm_id); + + ep_object = hvnd_setup_ep(cm_id, ND_LISTENER, nd_dev, uctx); + + if (!ep_object) { + hvnd_error("hvnd_setup_ep returned NULL\n"); + ret = -ENOMEM; + goto alloc_err; + } + + ret = hvnd_cr_listener(nd_dev, uctx, + &listener_handle); + if (ret) { + hvnd_error("hvnd_cr_listener failed ret=%d\n", ret); + goto cr_err; + } + + ep_object->ep_handle = listener_handle; + + cm_id->provider_data = ep_object; + + /* + * Now bind the listener. + * IPV4 support only. + */ + memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); + + ret = hvnd_bind_listener(nd_dev, uctx, + listener_handle, + &addr); + if (ret) { + hvnd_error("hvnd_bind_listener failed ret=%d\n", ret); + goto bind_err; + } + + /* + * Now get the local address. + */ + ret = hvnd_get_addr_listener(nd_dev, uctx, + listener_handle, + &local_addr); + if (ret) { + hvnd_error("hvnd_get_addr_listener failed ret=%d\n", ret); + goto bind_err; + } + + /* + * Now put the listener in the listen mode. + */ + + ret = hvnd_listen_listener(nd_dev, uctx, + listener_handle, + backlog); + + if (ret) { + hvnd_error("hvnd_listen_listener failed ret=%d\n", ret); + goto bind_err; + } + + + /* + * Now get a pending connection if one is pending.
+ */ + ret = hvnd_get_incoming_connections(ep_object, nd_dev, uctx); + if (ret) { + hvnd_error("hvnd_get_incoming_connections failed ret=%d\n", + ret); + goto bind_err; + } + + hvnd_acquire_uctx_ref(uctx); + hvnd_debug("cm_id=%p\n", cm_id); + return 0; + +bind_err: + hvnd_free_handle(nd_dev, uctx, + listener_handle, + IOCTL_ND_LISTENER_FREE); +cr_err: + kfree(ep_object); +alloc_err: + cm_id->provider_data = NULL; + cm_id->rem_ref(cm_id); + return ret; +} + +static int hvnd_destroy_listen(struct iw_cm_id *cm_id) +{ + struct hvnd_dev *nd_dev; + struct hvnd_ucontext *uctx; + struct hvnd_ep_obj *ep_object; + + nd_dev = to_nd_dev(cm_id->device); + + ep_object = (struct hvnd_ep_obj *)cm_id->provider_data; + + hvnd_debug("uctx is %p\n", ep_object->uctx); + hvnd_debug("Destroying Listener cm_id=%p\n", cm_id); + uctx = ep_object->uctx; + + // make sure there is nothing in progress on this ep + ep_stop(ep_object); + + hvnd_free_handle(nd_dev, uctx, + ep_object->ep_handle, + IOCTL_ND_LISTENER_FREE); + + /* + * We may have an outstanding connector for + * incoming connection requests; clean it up. + */ + + if (ep_object->outstanding_handle != 0) { + + // make sure there is nothing in progress on this ep + ep_stop(ep_object->outstanding_ep); + + hvnd_free_handle(nd_dev, uctx, + ep_object->outstanding_handle, + IOCTL_ND_CONNECTOR_FREE); + + + hvnd_drop_uctx_ref(nd_dev, uctx); + hvnd_destroy_ep(ep_object->outstanding_ep); + } + + /* + * Now everything should have stopped + */ + + cm_id->rem_ref(cm_id); + hvnd_destroy_ep(ep_object); + cm_id->provider_data = NULL; + hvnd_drop_uctx_ref(nd_dev, uctx); + + hvnd_debug("cm_id=%p\n", cm_id); + return 0; +} + +static void hvnd_qp_add_ref(struct ib_qp *ibqp) +{ + struct hvnd_qp *qp = to_nd_qp(ibqp); + + atomic_inc(&qp->refcnt); +} + +void hvnd_qp_rem_ref(struct ib_qp *ibqp) +{ + struct hvnd_qp *qp = to_nd_qp(ibqp); + + if (atomic_dec_and_test(&qp->refcnt)) + wake_up(&qp->wait); +} + +static DEVICE_ATTR(hw_rev, 0444, hvnd_show_rev, NULL); +static DEVICE_ATTR(fw_ver, 0444, hvnd_show_fw_ver, NULL); +static DEVICE_ATTR(hca_type, 0444, hvnd_show_hca, NULL); +static DEVICE_ATTR(board_id, 0444, hvnd_show_board, NULL); + +static struct device_attribute *hvnd_class_attributes[] = { + &dev_attr_hw_rev, + &dev_attr_fw_ver, + &dev_attr_hca_type, + &dev_attr_board_id, +}; + +int hvnd_register_device(struct hvnd_dev *dev, char *ip_addr, char *mac_addr) +{ + int ret; + + dev->ibdev.owner = THIS_MODULE; + dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY | IB_DEVICE_MEM_WINDOW; + dev->ibdev.local_dma_lkey = 0; + dev->ibdev.uverbs_cmd_mask = + (1ull << IB_USER_VERBS_CMD_GET_CONTEXT) | + (1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) | + (1ull << IB_USER_VERBS_CMD_QUERY_PORT) | + (1ull << IB_USER_VERBS_CMD_ALLOC_PD) | + (1ull << IB_USER_VERBS_CMD_DEALLOC_PD) | + (1ull << IB_USER_VERBS_CMD_REG_MR) | + (1ull << IB_USER_VERBS_CMD_DEREG_MR) | + (1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL) | + (1ull << IB_USER_VERBS_CMD_CREATE_CQ) | + (1ull << IB_USER_VERBS_CMD_DESTROY_CQ) | + (1ull << IB_USER_VERBS_CMD_REQ_NOTIFY_CQ) | + (1ull << IB_USER_VERBS_CMD_CREATE_QP) | + (1ull << IB_USER_VERBS_CMD_MODIFY_QP) | + (1ull << IB_USER_VERBS_CMD_QUERY_QP) | + (1ull << IB_USER_VERBS_CMD_POLL_CQ) | + (1ull << IB_USER_VERBS_CMD_DESTROY_QP) | + (1ull << IB_USER_VERBS_CMD_POST_SEND) | + (1ull << IB_USER_VERBS_CMD_POST_RECV); + dev->ibdev.node_type = RDMA_NODE_RNIC; + memcpy(dev->ibdev.node_desc, HVND_NODE_DESC, sizeof(HVND_NODE_DESC)); + memcpy(&dev->ibdev.node_guid, mac_addr, 6); + dev->ibdev.phys_port_cnt =
1; //dev->nports; + dev->ibdev.num_comp_vectors = 1; + dev->ibdev.query_device = hvnd_query_device; + dev->ibdev.query_port = hvnd_query_port; + dev->ibdev.get_link_layer = hvnd_get_link_layer; + dev->ibdev.query_pkey = hvnd_query_pkey; + dev->ibdev.query_gid = hvnd_query_gid; + dev->ibdev.alloc_ucontext = hvnd_alloc_ucontext; + dev->ibdev.dealloc_ucontext = hvnd_dealloc_ucontext; + dev->ibdev.mmap = hvnd_mmap; + dev->ibdev.alloc_pd = hvnd_allocate_pd; + dev->ibdev.dealloc_pd = hvnd_deallocate_pd; + dev->ibdev.create_ah = hvnd_ah_create; + dev->ibdev.destroy_ah = hvnd_ah_destroy; + dev->ibdev.create_qp = hvnd_ib_create_qp; + dev->ibdev.modify_qp = hvnd_ib_modify_qp; + dev->ibdev.query_qp = hvnd_ib_query_qp; + dev->ibdev.destroy_qp = hvnd_destroy_qp; + dev->ibdev.create_cq = hvnd_ib_create_cq; + dev->ibdev.destroy_cq = hvnd_ib_destroy_cq; + dev->ibdev.resize_cq = hvnd_resize_cq; + dev->ibdev.poll_cq = hvnd_poll_cq; + dev->ibdev.get_dma_mr = hvnd_get_dma_mr; + dev->ibdev.reg_user_mr = hvnd_reg_user_mr; + dev->ibdev.dereg_mr = hvnd_dereg_mr; + dev->ibdev.alloc_mw = hvnd_alloc_mw; + dev->ibdev.dealloc_mw = hvnd_dealloc_mw; + dev->ibdev.attach_mcast = hvnd_multicast_attach; + dev->ibdev.detach_mcast = hvnd_multicast_detach; + dev->ibdev.process_mad = hvnd_process_mad; + dev->ibdev.req_notify_cq = hvnd_arm_cq; + dev->ibdev.post_send = hvnd_post_send; + dev->ibdev.post_recv = hvnd_post_receive; + dev->ibdev.uverbs_abi_ver = MLX4_IB_UVERBS_ABI_VERSION; + + dev->ibdev.get_port_immutable = hvnd_get_port_immutable; + + //DMA ops for mapping all possible addresses + dev->ibdev.dev.parent = &(dev->hvdev->device); + dev->ibdev.dev.dma_ops = &vmbus_dma_ops; + dev->ibdev.dev.dma_mask = (u64 *) DMA_BIT_MASK(64); + dev->ibdev.dev.coherent_dma_mask = DMA_BIT_MASK(64); + + dev->ibdev.iwcm = kmalloc(sizeof(struct iw_cm_verbs), GFP_KERNEL); + if (!dev->ibdev.iwcm) + return -ENOMEM; + + dev->ibdev.iwcm->connect = hvnd_connect; + dev->ibdev.iwcm->accept = hvnd_accept_cr; + dev->ibdev.iwcm->reject = hvnd_reject_cr; + dev->ibdev.iwcm->create_listen = hvnd_create_listen; + dev->ibdev.iwcm->destroy_listen = hvnd_destroy_listen; + dev->ibdev.iwcm->add_ref = hvnd_qp_add_ref; + dev->ibdev.iwcm->rem_ref = hvnd_qp_rem_ref; + dev->ibdev.iwcm->get_qp = hvnd_get_qp; + + strlcpy(dev->ibdev.name, "mlx4_%d", IB_DEVICE_NAME_MAX); + ret = ib_register_device(&dev->ibdev, NULL); + if (ret) { + hvnd_error("ib_register_device failed ret=%d\n", ret); + goto bail1; + } + +#if 0 + for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) { + ret = device_create_file(&dev->ibdev.dev, + hvnd_class_attributes[i]); + if (ret) { + hvnd_error("device_create_file failed ret=%d\n", ret); + goto bail2; + } + } +#endif + + dev->ib_active = true; + return 0; +//bail2: +// ib_unregister_device(&dev->ibdev); +bail1: + kfree(dev->ibdev.iwcm); + return ret; +} + +void hvnd_unregister_device(struct hvnd_dev *dev) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) + device_remove_file(&dev->ibdev.dev, + hvnd_class_attributes[i]); + ib_unregister_device(&dev->ibdev); + kfree(dev->ibdev.iwcm); + ib_dealloc_device((struct ib_device *)dev); +} + +static int hvnd_try_bind_nic(unsigned char *mac, __be32 ip) +{ + int ret; + struct hvnd_dev *nd_dev = g_nd_dev; + + mutex_lock(&nd_dev->bind_mutex); + if (nd_dev->bind_complete) { + mutex_unlock(&nd_dev->bind_mutex); + return 1; + } + + memcpy(nd_dev->mac_addr, mac, 6); + *(__be32 *)(nd_dev->ip_addr) = ip; + + /* + * Bind the NIC. 
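+ * hvnd_bind_nic() is attempted for every local NIC/address pair (see + * start_bind_nic() below); the host accepts the bind only for the NIC + * backing the ND channel, so failure here just means this interface is + * not the RDMA-capable one.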
+ */ + hvnd_info("trying to bind to IP %pI4 MAC %pM\n", + nd_dev->ip_addr, nd_dev->mac_addr); + ret = hvnd_bind_nic(nd_dev, false, nd_dev->ip_addr, nd_dev->mac_addr); + if (ret || nd_dev->bind_pkt.pkt_hdr.status) { + mutex_unlock(&nd_dev->bind_mutex); + return 1; + } + + /* if we reach here, this means bind_nic is a success */ + hvnd_error("successfully bound to IP %pI4 MAC %pM\n", + nd_dev->ip_addr, nd_dev->mac_addr); + complete(&nd_dev->addr_set); + nd_dev->bind_complete = 1; + mutex_unlock(&nd_dev->bind_mutex); + + ret = hvnd_register_device(nd_dev, nd_dev->ip_addr, nd_dev->mac_addr); + + if (!ret) + return 0; + + hvnd_error("hvnd_register_device failed ret=%d\n", ret); + + /* roll back all allocated resources on error */ + iounmap(nd_dev->mmio_virt); + release_resource(&nd_dev->mmio_resource); + + vmbus_close(nd_dev->hvdev->channel); + ib_dealloc_device((struct ib_device *)nd_dev); + + return 1; +} + +static void hvnd_inetaddr_event_up(unsigned long event, struct in_ifaddr *ifa) +{ + hvnd_try_bind_nic(ifa->ifa_dev->dev->dev_addr, ifa->ifa_address); +} + +static int hvnd_inetaddr_event(struct notifier_block *notifier, + unsigned long event, void *ptr) +{ + struct in_ifaddr *ifa = ptr; + + switch (event) { + case NETDEV_UP: + hvnd_inetaddr_event_up(event, ifa); + break; + default: + hvnd_debug("Received inetaddr event %lu\n", event); + } + + return NOTIFY_DONE; +} + +static struct notifier_block hvnd_inetaddr_notifier = { + .notifier_call = hvnd_inetaddr_event, +}; + +static int start_bind_nic(void) +{ + struct net_device *dev; + struct in_device *idev; + struct in_ifaddr *ifa; + + register_inetaddr_notifier(&hvnd_inetaddr_notifier); + + rtnl_lock(); + for_each_netdev(&init_net, dev) { + idev = in_dev_get(dev); + if (!idev) + continue; + for (ifa = (idev)->ifa_list; + ifa && !(ifa->ifa_flags&IFA_F_SECONDARY); + ifa = ifa->ifa_next) { + hvnd_try_bind_nic(dev->dev_addr, ifa->ifa_address); + } + } + rtnl_unlock(); + + return 0; +} + +static int hvnd_probe(struct hv_device *dev, + const struct hv_vmbus_device_id *dev_id) +{ + struct hvnd_dev *nd_dev; + int ret = 0; + + hvnd_debug("hvnd starting\n"); + + nd_dev = (struct hvnd_dev *)ib_alloc_device(sizeof(struct hvnd_dev)); + if (!nd_dev) { + ret = -ENOMEM; + goto err_out0; + } + + nd_dev->hvdev = dev; + /* + * We are going to masquerade as MLX4 device; + * Set the vendor and device ID accordingly. + */ + dev->vendor_id = 0x15b3; //Mellanox + dev->device_id = 0x1003; //Mellanox HCA + INIT_LIST_HEAD(&nd_dev->listentry); + spin_lock_init(&nd_dev->uctxt_lk); + nd_dev->ib_active = false; + + /* + * Initialize the state for the id table. + */ + spin_lock_init(&nd_dev->id_lock); + idr_init(&nd_dev->cqidr); + idr_init(&nd_dev->qpidr); + idr_init(&nd_dev->mmidr); + idr_init(&nd_dev->irpidr); + idr_init(&nd_dev->uctxidr); + + atomic_set(&nd_dev->open_cnt, 0); + + sema_init(&nd_dev->query_pkt_sem, 1); + + ret = vmbus_open(dev->channel, HVND_RING_SZ, HVND_RING_SZ, NULL, 0, + hvnd_callback, dev); + + if (ret) { + hvnd_error("vmbus_open failed ret=%d\n", ret); + goto err_out1; + } + + hv_set_drvdata(dev, nd_dev); + + ret = hvnd_negotiate_version(nd_dev); + + if (ret) { + hvnd_error("hvnd_negotiate_version failed ret=%d\n", ret); + goto err_out2; + } + + /* + * Register resources with the host. 
+ */ + ret = hvnd_init_resources(nd_dev); + if (ret) { + hvnd_error("hvnd_init_resources failed ret=%d\n", ret); + goto err_out2; + } + + /* + * Try to bind every NIC to ND channel, + * ND host will only return success for the correct one + */ + nd_dev->bind_complete = 0; + mutex_init(&nd_dev->bind_mutex); + init_completion(&nd_dev->addr_set); + + g_nd_dev = nd_dev; + start_bind_nic(); + + return 0; + +err_out2: + vmbus_close(dev->channel); + +err_out1: + ib_dealloc_device((struct ib_device *)nd_dev); + +err_out0: + return ret; +} + +static int hvnd_remove(struct hv_device *dev) +{ + struct hvnd_dev *nd_dev = hv_get_drvdata(dev); + + unregister_inetaddr_notifier(&hvnd_inetaddr_notifier); + hvnd_bind_nic(nd_dev, true, nd_dev->ip_addr, nd_dev->mac_addr); + hvnd_unregister_device(nd_dev); + vmbus_close(dev->channel); + iounmap(nd_dev->mmio_virt); + release_resource(&nd_dev->mmio_resource); + return 0; +} + +static const struct hv_vmbus_device_id id_table[] = { + /* VMBUS RDMA class guid */ + /* 8c2eaf3d-32a7-4b09-ab99-bd1f1c86b501 */ + { HV_ND_GUID, }, + { }, +}; + +MODULE_DEVICE_TABLE(vmbus, id_table); + +static struct hv_driver hvnd_drv = { + .name = "hv_guest_rdma", + .id_table = id_table, + .probe = hvnd_probe, + .remove = hvnd_remove, +}; + + +static int __init init_hvnd_drv(void) +{ + + pr_info("Registered HyperV networkDirect Driver\n"); + hvnd_addr_init(); + return vmbus_driver_register(&hvnd_drv); +} + +static void exit_hvnd_drv(void) +{ + pr_info("De-Registered HyperV networkDirect Driver\n"); + hvnd_addr_deinit(); + vmbus_driver_unregister(&hvnd_drv); +} + + +module_init(init_hvnd_drv); +module_exit(exit_hvnd_drv); + +MODULE_DESCRIPTION("Hyper-V NetworkDirect Driver"); +MODULE_LICENSE("GPL"); +MODULE_VERSION("4.1.3"); diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/user.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/user.h new file mode 100644 index 000000000000..07e6769ef43b --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/user.h @@ -0,0 +1,107 @@ +/* + * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved. + * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. 
+ */ + +#ifndef MLX4_IB_USER_H +#define MLX4_IB_USER_H + +#include <linux/types.h> + +/* + * Increment this value if any changes that break userspace ABI + * compatibility are made. + */ + +#define MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION 3 +#define MLX4_IB_UVERBS_ABI_VERSION 4 + +/* + * Make sure that all structs defined in this file remain laid out so + * that they pack the same way on 32-bit and 64-bit architectures (to + * avoid incompatibility between 32-bit userspace and 64-bit kernels). + * In particular do not use pointer types -- pass pointers in __u64 + * instead. + */ + +struct mlx4_ib_alloc_ucontext_resp_v3 { + __u32 qp_tab_size; + __u16 bf_reg_size; + __u16 bf_regs_per_page; +}; + +struct mlx4_ib_alloc_ucontext_resp { + __u32 dev_caps; + __u32 qp_tab_size; + __u16 bf_reg_size; + __u16 bf_regs_per_page; + __u32 cqe_size; +}; + +struct mlx4_ib_alloc_pd_resp { + __u32 pdn; + __u32 reserved; +}; + +struct mlx4_ib_create_cq { + __u64 buf_addr; + __u64 db_addr; +}; + +struct mlx4_ib_create_cq_resp { + __u32 cqn; + __u32 reserved; +}; + +struct mlx4_ib_resize_cq { + __u64 buf_addr; +}; + +struct mlx4_ib_create_srq { + __u64 buf_addr; + __u64 db_addr; +}; + +struct mlx4_ib_create_srq_resp { + __u32 srqn; + __u32 reserved; +}; + +struct mlx4_ib_create_qp { + __u64 buf_addr; + __u64 db_addr; + __u8 log_sq_bb_count; + __u8 log_sq_stride; + __u8 sq_no_prefetch; + __u8 reserved[5]; +}; + +#endif /* MLX4_IB_USER_H */ diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.c new file mode 100644 index 000000000000..daf68dcccb1b --- /dev/null +++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.c @@ -0,0 +1,3074 @@ +/* + * Copyright (c) 2014, Microsoft Corporation. + * + * Author: + * K. Y. Srinivasan <kys@microsoft.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published + * by the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but + * WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or + * NON INFRINGEMENT. See the GNU General Public License for more + * details. + * + * Bug fixes/enhancements: Long Li <longli@microsoft.com> + */ + +#include <linux/completion.h> +#include <linux/module.h> +#include <linux/errno.h> +#include <linux/hyperv.h> +#include <linux/efi.h> +#include <linux/slab.h> +#include <linux/cred.h> +#include <linux/sched.h> +#include <linux/types.h> +#include <linux/scatterlist.h> +#include <rdma/ib_umem.h> +#include <rdma/ib_user_verbs.h> +#include <asm-generic/delay.h> + +#include "vmbus_rdma.h" + +/* + * We only have a single rdma device on the host; + * have a single receive buffer. 
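+ *
+ * A minimal sketch of the receive side under that assumption (the
+ * actual parsing lives in hvnd_callback(), outside this hunk):
+ *
+ *   vmbus_recvpacket(channel, hvnd_recv_buffer,
+ *                    sizeof(hvnd_recv_buffer), &recvlen, &requestid);
+ *
+ * i.e. every inbound host packet lands in this one static buffer.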
+ */ + + +static char hvnd_recv_buffer[PAGE_SIZE * 4]; + +static atomic_t irp_local_hdl; + +char *hvnd_get_op_name(int ioctl) +{ + switch (ioctl) { + case IOCTL_ND_PROVIDER_INIT: + return "IOCTL_ND_PROVIDER_INIT\n"; + case IOCTL_ND_PROVIDER_BIND_FILE: + return "IOCTL_ND_PROVIDER_BIND_FILE\n"; + case IOCTL_ND_ADAPTER_OPEN: + return "IOCTL_ND_ADAPTER_OPEN\n"; + + case IOCTL_ND_ADAPTER_CLOSE: + return "IOCTL_ND_ADAPTER_CLOSE\n"; + + case IOCTL_ND_ADAPTER_QUERY: + return "IOCTL_ND_ADAPTER_QUERY\n"; + + case IOCTL_ND_PD_CREATE: + return "IOCTL_ND_PD_CREATE\n"; + + case IOCTL_ND_PD_FREE: + return "IOCTL_ND_PD_FREE\n"; + + case IOCTL_ND_CQ_CREATE: + return "IOCTL_ND_CQ_CREATE\n"; + + case IOCTL_ND_CQ_FREE: + return "IOCTL_ND_CQ_FREE\n"; + case IOCTL_ND_CQ_CANCEL_IO: + return "IOCTL_ND_CQ_CANCEL_IO\n"; + case IOCTL_ND_CQ_GET_AFFINITY: + return "IOCTL_ND_CQ_GET_AAFINITY\n"; + case IOCTL_ND_CQ_MODIFY: + return "IOCTL_ND_CQ_MODIFY\n"; + + case IOCTL_ND_CQ_NOTIFY: + return "IOCTL_ND_CQ_NOTIFY\n"; + + + case IOCTL_ND_LISTENER_CREATE: + return "IOCTL_ND_LISTENER_CREATE\n"; + + case IOCTL_ND_LISTENER_FREE: + return "IOCTL_ND_LISTENER_FREE\n"; + + case IOCTL_ND_QP_FREE: + return "IOCTL_ND_QP_FREE\n"; + + case IOCTL_ND_CONNECTOR_CANCEL_IO: + return "IOCTL_ND_CONNECTOR_CANCEL_IO\n"; + + case IOCTL_ND_LISTENER_CANCEL_IO: + return "IOCTL_ND_LISTENER_CANCEL_IO\n"; + + case IOCTL_ND_LISTENER_BIND: + return "IOCTL_ND_LISTENER_BIND\n"; + + case IOCTL_ND_LISTENER_LISTEN: + return "IOCTL_ND_LISTENER_LISTEN\n"; + + case IOCTL_ND_LISTENER_GET_ADDRESS: + return "IOCTL_ND_LISTENER_GET_ADDRESS\n"; + + case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: + return "IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST\n"; + + + + case IOCTL_ND_CONNECTOR_CREATE: + return "IOCTL_ND_CONNECTOR_CREATE\n"; + + case IOCTL_ND_CONNECTOR_FREE: + return "IOCTL_ND_CONNECTOR_FREE\n"; + + case IOCTL_ND_CONNECTOR_BIND: + return "IOCTL_ND_CONNECTOR_BIND\n"; + + // KYS: ALERT: ASYNCH Operation + case IOCTL_ND_CONNECTOR_CONNECT: + return "IOCTL_ND_CONNECTOR_CONNECT\n"; + + case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: + return "IOCTL_ND_CONNECTOR_COMPLETE_CONNECT\n"; + + // KYS: ALERT: ASYNCH Operation + case IOCTL_ND_CONNECTOR_ACCEPT: + return "IOCTL_ND_CONNECTOR_ACCEPT\n"; + + case IOCTL_ND_CONNECTOR_REJECT: + return "IOCTL_ND_CONNECTOR_REJECT\n"; + + case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: + return "IOCTL_ND_CONNECTOR_GET_READ_LIMITS\n"; + + case IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: + return "IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA\n"; + + case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: + return "IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS\n"; + + case IOCTL_ND_CONNECTOR_GET_ADDRESS: + return "IOCTL_ND_CONNECTOR_GET_ADDRESS\n"; + + // KYS: ALERT: ASYNCH Operation + case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: + return "IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT\n"; + + // KYS: ALERT: ASYNCH Operation + case IOCTL_ND_CONNECTOR_DISCONNECT: + return "IOCTL_ND_CONNECTOR_DISCONNECT\n"; + + + + case IOCTL_ND_QP_CREATE: + return "IOCTL_ND_QP_CREATE\n"; + + case IOCTL_ND_MR_CREATE: + return "IOCTL_ND_MR_CREATE\n"; + + case IOCTL_ND_MR_FREE: + return "IOCTL_ND_MR_FREE\n"; + case IOCTL_ND_MR_REGISTER: + return "IOCTL_ND_MR_REGISTER\n"; + case IOCTL_ND_MR_DEREGISTER: + return "IOCTL_ND_MR_DEREGISTER\n"; + case IOCTL_ND_MR_CANCEL_IO: + return "IOCTL_ND_MR_CANCEL_IO\n"; + case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: + return "IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST\n"; + case IOCTL_ND_QP_FLUSH: + return "IOCTL_ND_QP_FLUSH\n"; + + default: + return "Unknown IOCTL\n"; + } +} +int get_irp_handle(struct 
hvnd_dev *nd_dev, u32 *local, void *irp_ctx) +{ + unsigned int local_handle; + int ret; + + local_handle = atomic_inc_return(&irp_local_hdl); + *local = local_handle; + + /* + * Now asssociate the local handle with the pointer. + */ + ret = insert_handle(nd_dev, &nd_dev->irpidr, irp_ctx, local_handle); + hvnd_debug("irp_ctx=%p local_handle=%u\n", irp_ctx, local_handle); + + if (ret) { + hvnd_error("insert_handle failed ret=%d\n", ret); + return ret; + } + + return 0; +} + +void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp) +{ + remove_handle(nd_dev, &nd_dev->irpidr, irp); + +} + +static void init_pfn(u64 *pfn, void *addr, u32 length) +{ + int i; + u32 offset = offset_in_page(addr); + u32 num_pfn = DIV_ROUND_UP(offset + length, PAGE_SIZE); + + for (i = 0; i < num_pfn; i++) + pfn[i] = virt_to_phys((u8 *)addr + + (PAGE_SIZE * i)) >> PAGE_SHIFT; +} + + +static void user_va_init_pfn(u64 *pfn, struct ib_umem *umem) +{ + int entry; + struct scatterlist *sg; + int i = 0; + + for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) { + pfn[i++] = page_to_pfn(sg_page(sg)); + } +} + +static u32 get_local_handle(void *p) +{ + u64 val = (unsigned long)p; + + return (u32)val; +} + +static int hvnd_send_pg_buffer(struct hvnd_dev *nd_dev, + struct vmbus_packet_mpb_array *desc, + u32 desc_size, + void *buffer, + u32 bufferlen, u64 cookie) +{ + int ret; + int t; + struct hvnd_cookie hvnd_cookie; + + hvnd_cookie.pkt = (void *)cookie; + init_completion(&hvnd_cookie.host_event); + + ret = vmbus_sendpacket_mpb_desc(nd_dev->hvdev->channel, + desc, + desc_size, + buffer, bufferlen, + (u64)(&hvnd_cookie)); + + if (ret) { + hvnd_error("vmbus_sendpacket_mpb_desc failed ret=%d\n", ret); + goto err; + } + + t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); + + if (t == 0) { + hvnd_error("wait_for_completion_timeout timed out\n"); + ret = -ETIMEDOUT; + } + +err: + return ret; +} + +static int hvnd_send_packet(struct hvnd_dev *nd_dev, void *buffer, + u32 bufferlen, u64 cookie, bool block) +{ + int ret; + int t; + struct hvnd_cookie hvnd_cookie; + + hvnd_cookie.pkt = (void *)cookie; + init_completion(&hvnd_cookie.host_event); + + ret = vmbus_sendpacket(nd_dev->hvdev->channel, buffer, bufferlen, + (u64)(&hvnd_cookie), VM_PKT_DATA_INBAND, + VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); + + if (ret) { + hvnd_error("vmbus_send pkt failed: %d\n", ret); + goto err; + } + + if (!block) + return ret; + + t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); + + if (t == 0) { + hvnd_error("wait_for_completion_timeout timed out\n"); + ret = -ETIMEDOUT; + } + +err: + return ret; +} + +static int hvnd_send_pgbuf_ioctl_pkt(struct hvnd_dev *nd_dev, + struct vmbus_packet_mpb_array *desc, + u32 desc_size, + struct ndv_packet_hdr_control_1 *hdr, + u32 pkt_size, u64 cookie) +{ + int ret; + int ioctl; + + ioctl = hdr->io_cntrl_code; + + + ret = hvnd_send_pg_buffer(nd_dev, desc, desc_size, + hdr, pkt_size, cookie); + + if (ret) + return ret; + + if (hdr->pkt_hdr.status != 0) { + hvnd_error("IOCTL: %s failed; status is %x\n", + hvnd_get_op_name(ioctl), + hdr->pkt_hdr.status); + return -EINVAL; + } + + switch (hdr->io_status) { + case STATUS_SUCCESS: + case STATUS_PENDING: + return 0; + + default: + hvnd_error("IOCTL: %s failed io status is %x\n", + hvnd_get_op_name(ioctl), hdr->io_status); + return -EINVAL; + } +} + +int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev, + struct ndv_packet_hdr_control_1 *hdr, + u32 pkt_size, u64 cookie) +{ + int ret; + int ioctl; + bool block; + + block = (hdr->irp_handle.val64 == 0) ? 
true : false; + + + ioctl = hdr->io_cntrl_code; + + ret = hvnd_send_packet(nd_dev, hdr, pkt_size, cookie, block); + + if (ret) + return ret; + + if (!block) + return ret; + + if (hdr->pkt_hdr.status != 0) { + hvnd_error("IOCTL: %s failed; status is %x\n", + hvnd_get_op_name(ioctl), hdr->pkt_hdr.status); + return -EINVAL; + } + + switch (hdr->io_status) { + case STATUS_SUCCESS: + case STATUS_PENDING: + return 0; + + default: + hvnd_warn("IOCTL: %s failed io status is %x\n", + hvnd_get_op_name(ioctl), hdr->io_status); + return -EINVAL; + } +} + +void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr, + u32 data_sz, u32 local, u32 remote, + u32 ioctl_code, + u32 ext_data_sz, u32 ext_data_offset, + u64 irp_handle) + +{ + int pkt_type; + + pkt_type = NDV_PKT_ID1_CONTROL; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); + hdr->pkt_hdr.packet_type = pkt_type; + hdr->pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); + hdr->pkt_hdr.data_sz = data_sz; + + hdr->pkt_hdr.status = 0; + + hdr->file_handle.local = local; + hdr->file_handle.remote = remote; + hdr->irp_handle.val64 = irp_handle; + + hdr->io_cntrl_code = ioctl_code; + hdr->output_buf_sz = data_sz - ext_data_sz; + hdr->input_buf_sz = data_sz - ext_data_sz; + + hdr->input_output_buf_offset = 0; + + hdr->extended_data.size = ext_data_sz; + hdr->extended_data.offset = ext_data_offset; +} + + +int hvnd_create_file(struct hvnd_dev *nd_dev, void *uctx, + struct ndv_pkt_hdr_create_1 *create, u32 file_flags) +{ + int ret; + int pkt_type; + + + pkt_type = NDV_PKT_ID1_CREATE; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); + create->pkt_hdr.packet_type = pkt_type; + create->pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); + create->pkt_hdr.data_sz = 0; + + create->handle.local = get_local_handle(uctx); + create->access_mask = STANDARD_RIGHTS_ALL; + create->open_options = OPEN_EXISTING; + create->file_attributes = FILE_ATTRIBUTE_NORMAL | file_flags; + create->share_access = FILE_SHARE_ALL; + + ret = hvnd_send_packet(nd_dev, create, + sizeof(struct ndv_pkt_hdr_create_1), + (unsigned long)create, true); + return ret; +} + +int hvnd_cleanup_file(struct hvnd_dev *nd_dev, u32 local, u32 remote) +{ + int ret; + int pkt_type; + struct ndv_pkt_hdr_cleanup_1 cleanup_pkt; + + + pkt_type = NDV_PKT_ID1_CLEANUP; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); + + cleanup_pkt.pkt_hdr.packet_type = pkt_type; + cleanup_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); + cleanup_pkt.pkt_hdr.data_sz = 0; + + cleanup_pkt.handle.local = local; + cleanup_pkt.handle.remote = remote; + + ret = hvnd_send_packet(nd_dev, &cleanup_pkt, + sizeof(struct ndv_pkt_hdr_create_1), + (unsigned long)&cleanup_pkt, true); + return ret; +} + + +static int hvnd_do_ioctl(struct hvnd_dev *nd_dev, u32 ioctl, + struct pkt_nd_provider_ioctl *pkt, + union ndv_context_handle *hdr_handle, + struct nd_handle *ioctl_handle, + u8 *buf, u32 buf_len, bool c_in, bool c_out, u64 irp_val) +{ + int ret; + int pkt_type; + + pkt_type = NDV_PKT_ID1_CONTROL; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); + + pkt->hdr.pkt_hdr.packet_type = pkt_type; + pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); + pkt->hdr.pkt_hdr.data_sz = (sizeof(struct pkt_nd_provider_ioctl) - + sizeof(struct ndv_packet_hdr_control_1)); + + pkt->hdr.file_handle.local = hdr_handle->local; + pkt->hdr.file_handle.remote = hdr_handle->remote; + hvnd_debug("create handle local: %x remote: %x\n", + 
hdr_handle->local, hdr_handle->remote); + + pkt->hdr.irp_handle.val64 = irp_val; + + pkt->hdr.io_cntrl_code = ioctl; + pkt->hdr.output_buf_sz = sizeof(struct nd_ioctl); + pkt->hdr.input_buf_sz = sizeof(struct nd_ioctl); + pkt->hdr.input_output_buf_offset = 0; + memset(&pkt->ioctl.handle, 0, sizeof(struct nd_handle)); + pkt->ioctl.handle.version = ND_VERSION_1; + + switch (ioctl) { + case IOCTL_ND_PROVIDER_BIND_FILE: + pkt->ioctl.handle.handle = ioctl_handle->handle; + break; + default: + break; + }; + + /* + * Copy the input buffer, if needed. + */ + + if (c_in && (buf != NULL)) + memcpy(pkt->ioctl.raw_buffer, buf, buf_len); + + ret = hvnd_send_packet(nd_dev, pkt, + sizeof(struct pkt_nd_provider_ioctl), + (unsigned long)pkt, true); + + if (ret) + return ret; + + if (c_out && (buf != NULL)) + memcpy(buf, pkt->ioctl.raw_buffer, buf_len); + + return ret; +} + +static int idr_callback(int id, void *p, void *data) +{ + if (p == data) + return id; + return 0; +} + +void remove_uctx(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) +{ + int pid = current_pid(); + unsigned long flags; + int id; + + if (get_uctx(nd_dev, pid) == uctx) + remove_handle(nd_dev, &nd_dev->uctxidr, pid); + else { + hvnd_warn("uctx %p not found on pid %d, doing a idr search\n", + uctx, current_pid()); + + spin_lock_irqsave(&nd_dev->id_lock, flags); + id = idr_for_each(&nd_dev->uctxidr, idr_callback, uctx); + spin_unlock_irqrestore(&nd_dev->id_lock, flags); + + if (id) + remove_handle(nd_dev, &nd_dev->uctxidr, id); + else { + hvnd_error("uctx %p not found in idr table\n", uctx); + return; + } + } + + kfree(uctx); +} + +int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) +{ + int ret; + + /* + * First close the adaptor. + */ + + ret = hvnd_free_handle(nd_dev, uctx, + uctx->adaptor_hdl, + IOCTL_ND_ADAPTER_CLOSE); + + if (ret) + hvnd_error("Adaptor close failed; ret is %x\n", ret); + + /* + * Now close the two files we created. + */ + + ret = hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, + uctx->file_handle_ovl.remote); + + if (ret) + hvnd_error("file cleanup failed; ret is %x\n", ret); + + ret = hvnd_cleanup_file(nd_dev, uctx->file_handle.local, + uctx->file_handle.remote); + + if (ret) + hvnd_error("File cleanup failed; ret is %x\n", ret); + + /* + * Remove the uctx from the ID table. + */ + remove_uctx(nd_dev, uctx); + + return 0; +} + +int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) +{ + int ret; + struct pkt_nd_provider_ioctl *pr_init = &uctx->pr_init_pkt; + int pkt_type; + struct nd_handle ioctl_handle; + struct pkt_nd_open_adapter *pr_o_adap = &uctx->o_adap_pkt; + + ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt, 0); + if (ret) { + hvnd_error("hvnd_create_file failed ret=%d\n", ret); + goto error_cr; + } + + if (uctx->create_pkt.pkt_hdr.status != 0) { + hvnd_error("create File failed; status is %d\n", + uctx->create_pkt.pkt_hdr.status); + ret = -EINVAL; + goto error_cr; + } + + uctx->file_handle.local = uctx->create_pkt.handle.local; + uctx->file_handle.remote = uctx->create_pkt.handle.remote; + hvnd_debug("INITIALIZE PROVIDER\n"); + /* + * Now Initialize the Provider. 
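+	 *
+	 * The overall sequence mirrors the Windows NetworkDirect
+	 * user-mode handshake:
+	 *
+	 *   hvnd_create_file()            ->  file_handle
+	 *   IOCTL_ND_PROVIDER_INIT        ->  provider ready
+	 *   hvnd_create_file(OVERLAPPED)  ->  file_handle_ovl
+	 *   IOCTL_ND_PROVIDER_BIND_FILE   ->  ties the two together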
+	 */
+	ioctl_handle.handle = 0;
+	ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_INIT, pr_init,
+			    &uctx->create_pkt.handle,
+			    &ioctl_handle, NULL, 0, false, false, 0);
+
+	if (ret) {
+		ret = -EINVAL;
+		goto error_pr_init;
+	}
+
+	if (pr_init->hdr.pkt_hdr.status != 0) {
+		hvnd_error("Provider INIT failed; status is %d\n",
+			   pr_init->hdr.pkt_hdr.status);
+		ret = -EINVAL;
+		goto error_pr_init;
+	}
+
+	if (pr_init->hdr.io_status != 0) {
+		hvnd_error("Provider INIT failed; io status is %d\n",
+			   pr_init->hdr.io_status);
+		ret = -EINVAL;
+		goto error_pr_init;
+	}
+
+	/*
+	 * Now create the overlap file.
+	 */
+
+	hvnd_debug("CREATE OVERLAP FILE\n");
+	ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt_ovl,
+			       FILE_FLAG_OVERLAPPED);
+	if (ret) {
+		hvnd_error("hvnd_create_file failed ret=%d\n", ret);
+		goto error_pr_init;
+	}
+
+	if (uctx->create_pkt_ovl.pkt_hdr.status != 0) {
+		hvnd_error("create Overlap File failed; status is %d\n",
+			   uctx->create_pkt_ovl.pkt_hdr.status);
+		ret = -EINVAL;
+		goto error_pr_init;
+	}
+	uctx->file_handle_ovl.local = uctx->create_pkt_ovl.handle.local;
+	uctx->file_handle_ovl.remote = uctx->create_pkt_ovl.handle.remote;
+
+	/*
+	 * Now bind the two file handles together.
+	 */
+
+	hvnd_debug("BIND FILE IOCTL remote handle: %d local handle: %d\n",
+		   uctx->create_pkt_ovl.handle.remote,
+		   uctx->create_pkt_ovl.handle.local);
+
+	ioctl_handle.handle = uctx->create_pkt_ovl.handle.val64;
+	ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_BIND_FILE, pr_init,
+			    &uctx->create_pkt.handle,
+			    &ioctl_handle, NULL, 0, false, false, 0);
+
+	if (ret) {
+		ret = -EINVAL;
+		goto error_file_bind;
+	}
+	if (pr_init->hdr.pkt_hdr.status != 0) {
+		hvnd_error("Provider File bind failed; status is %d\n",
+			   pr_init->hdr.pkt_hdr.status);
+		ret = -EINVAL;
+		goto error_file_bind;
+	}
+	if (pr_init->hdr.io_status != 0) {
+		hvnd_error("Provider File bind failed; io status is %d\n",
+			   pr_init->hdr.io_status);
+		ret = -EINVAL;
+		goto error_file_bind;
+	}
+
+	/*
+	 * Now open the adaptor.
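+	 *
+	 * ADAPTER_OPEN carries extended data describing the UAR and
+	 * blue-flame mappings the guest wants, so the control header
+	 * below is filled in field by field, with the mapping request
+	 * appended after the ioctl payload.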
+ */ + + hvnd_debug("OPENING THE ADAPTOR\n"); + + pkt_type = NDV_PKT_ID1_CONTROL; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); + pr_o_adap->hdr.pkt_hdr.packet_type = pkt_type; + pr_o_adap->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); + pr_o_adap->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_open_adapter) - + sizeof(struct ndv_packet_hdr_control_1); + + pr_o_adap->hdr.pkt_hdr.status = 0; + + hvnd_debug("hdr sz is %d\n", pr_o_adap->hdr.pkt_hdr.hdr_sz); + hvnd_debug("data sz is %d\n", pr_o_adap->hdr.pkt_hdr.data_sz); + + pr_o_adap->hdr.file_handle.local = uctx->create_pkt.handle.local; + pr_o_adap->hdr.file_handle.remote = uctx->create_pkt.handle.remote; + hvnd_debug("create handle local is %x\n", + uctx->create_pkt.handle.local); + hvnd_debug("create handle remote is %x\n", + uctx->create_pkt.handle.remote); + pr_o_adap->hdr.irp_handle.val64 = 0; + + pr_o_adap->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_OPEN; + pr_o_adap->hdr.output_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - + sizeof(struct extended_data_oad); + pr_o_adap->hdr.input_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - + sizeof(struct extended_data_oad); + + hvnd_debug("output buf sz is %d\n", pr_o_adap->hdr.output_buf_sz); + hvnd_debug("input buf sz is %d\n", pr_o_adap->hdr.input_buf_sz); + hvnd_debug("packet size is %d\n", + (int)sizeof(struct pkt_nd_open_adapter)); + + pr_o_adap->hdr.input_output_buf_offset = 0; + + + pr_o_adap->hdr.extended_data.size = sizeof(struct extended_data_oad); + pr_o_adap->hdr.extended_data.offset = + offsetof(struct pkt_nd_open_adapter, ext_data) - + sizeof(struct ndv_packet_hdr_control_1); + + hvnd_debug("size of the extended data size: %d\n", + (int)sizeof(struct extended_data_oad)); + hvnd_debug("offset of extended data: %d\n", + pr_o_adap->hdr.extended_data.offset); + + /* + * Now fill out the ioctl section. + */ + + pr_o_adap->ioctl.input.version = ND_VERSION_1; + pr_o_adap->ioctl.input.ce_mapping_cnt = + RTL_NUMBER_OF(pr_o_adap->mappings.ctx_input.mappings); + + hvnd_debug("ce_mapping cnt is %d\n", + pr_o_adap->ioctl.input.ce_mapping_cnt); + + pr_o_adap->ioctl.input.cb_mapping_offset = sizeof(union oad_ioctl); + hvnd_debug("cb_mapping offset is %d\n", + pr_o_adap->ioctl.input.cb_mapping_offset); + pr_o_adap->ioctl.input.adapter_id = (u64)nd_dev; + + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_type = ND_MAP_IOSPACE; + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cache_type = ND_NON_CACHED; + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cb_length = 4096; + + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_type = ND_MAP_IOSPACE; + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cache_type = ND_WRITE_COMBINED; + pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cb_length = 4096; + + /* + * Fill in the extended data. 
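+	 *
+	 * Here the extended data is just the count of context mappings
+	 * requested above (UAR and BF); the host returns the
+	 * corresponding mapping_results that are copied out below.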
+	 */
+	pr_o_adap->ext_data.cnt = IBV_GET_CONTEXT_MAPPING_MAX;
+
+	ret = hvnd_send_packet(nd_dev, pr_o_adap,
+			       sizeof(struct pkt_nd_open_adapter),
+			       (unsigned long)pr_o_adap, true);
+	if (ret) {
+		ret = -EINVAL;
+		goto error_file_bind;
+	}
+
+	if (pr_o_adap->hdr.pkt_hdr.status != 0) {
+		hvnd_error("Open adaptor failed; status is %d\n",
+			   pr_o_adap->hdr.pkt_hdr.status);
+		ret = -EINVAL;
+		goto error_file_bind;
+	}
+
+	if (pr_o_adap->hdr.io_status != 0) {
+		hvnd_error("Open adaptor failed; io status is %d\n",
+			   pr_o_adap->hdr.io_status);
+		ret = -EINVAL;
+		goto error_file_bind;
+	}
+
+	/*
+	 * Copy the necessary response from the host.
+	 */
+
+	uctx->adaptor_hdl = pr_o_adap->ioctl.resrc_desc.handle;
+
+	hvnd_debug("adaptor handle: %p\n", (void *)uctx->adaptor_hdl);
+
+	uctx->uar_base =
+		pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_UAR].info;
+	hvnd_debug("uar base: %p\n", (void *)uctx->uar_base);
+
+	uctx->bf_base =
+		pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_BF].info;
+	hvnd_debug("bf base: %p\n", (void *)uctx->bf_base);
+
+	uctx->bf_buf_size =
+		pr_o_adap->mappings.ctx_output.bf_buf_size;
+	hvnd_debug("bf buf size: %d\n", uctx->bf_buf_size);
+
+	uctx->bf_offset =
+		pr_o_adap->mappings.ctx_output.bf_offset;
+	hvnd_debug("bf offset: %d\n", uctx->bf_offset);
+
+	uctx->cqe_size =
+		pr_o_adap->mappings.ctx_output.cqe_size;
+	hvnd_debug("cqe size: %d\n", uctx->cqe_size);
+
+	uctx->max_qp_wr =
+		pr_o_adap->mappings.ctx_output.max_qp_wr;
+	hvnd_debug("max qp wr: %d\n", uctx->max_qp_wr);
+
+	uctx->max_sge =
+		pr_o_adap->mappings.ctx_output.max_sge;
+	hvnd_debug("max sge: %d\n", uctx->max_sge);
+
+	uctx->max_cqe =
+		pr_o_adap->mappings.ctx_output.max_cqe;
+	hvnd_debug("max cqe: %d\n", uctx->max_cqe);
+
+	uctx->num_qps =
+		pr_o_adap->mappings.ctx_output.qp_tab_size;
+	hvnd_debug("num qps: %d\n", uctx->num_qps);
+
+	/*
+	 * Now query the adaptor and stash away the adaptor info.
+	 */
+
+	ret = hvnd_query_adaptor(nd_dev, uctx);
+	if (ret) {
+		hvnd_error("Query Adaptor failed; ret is %d\n", ret);
+		goto query_err;
+	}
+
+	return ret;
+
+query_err:
+	hvnd_free_handle(nd_dev, uctx,
+			 uctx->adaptor_hdl,
+			 IOCTL_ND_ADAPTER_CLOSE);
+
+	hvnd_error("Open Adaptor Failed!!\n");
+
+error_file_bind:
+	hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local,
+			  uctx->file_handle_ovl.remote);
+
+error_pr_init:
+	hvnd_cleanup_file(nd_dev, uctx->file_handle.local,
+			  uctx->file_handle.remote);
+
+error_cr:
+	if (get_uctx(nd_dev, current_pid()) != NULL)
+		remove_handle(nd_dev, &nd_dev->uctxidr, current_pid());
+
+	return ret;
+}
+
+int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		   struct hvnd_cq *cq)
+{
+	int ret;
+	struct pkt_nd_create_cq *pkt;
+	int num_pfn, num_db_pfn, num_sn_pfn;
+	int cq_pkt_size;
+	unsigned int cq_buf_size, offset;
+	u32 ext_data_sz;
+	u32 ext_data_offset;
+
+	/*
+	 * Now create the CQ.
+	 * First compute the number of PFNs we need to accommodate:
+	 * one each for the doorbell and arm_sn, plus the pages in the
+	 * CQ buffer.
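+	 *
+	 * Worked example (illustrative values, 4 KiB pages, 64-byte
+	 * CQEs):
+	 *
+	 *   cq_buf_size = 256 * 64;                        // 16384 bytes
+	 *   offset      = offset_in_page(cq->cq_buf);      // say 512
+	 *   num_pfn     = DIV_ROUND_UP(512 + 16384, 4096); // -> 5 pages
+	 *
+	 * plus one PFN each for the 8-byte doorbell and 4-byte arm_sn.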
+ */ + cq_buf_size = (cq->entries * uctx->cqe_size); + offset = offset_in_page(cq->cq_buf); + num_pfn = DIV_ROUND_UP(offset + cq_buf_size, PAGE_SIZE); + + offset = offset_in_page(cq->db_addr); + num_db_pfn = DIV_ROUND_UP(offset + 8, PAGE_SIZE); + + offset = offset_in_page(&cq->arm_sn); + num_sn_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); + + cq_pkt_size = sizeof(struct pkt_nd_create_cq) + + (num_pfn * sizeof(u64)); + + ext_data_sz = sizeof(struct create_cq_ext_data) + + (num_pfn * sizeof(u64)); + ext_data_offset = offsetof(struct pkt_nd_create_cq, ext_data) - + sizeof(struct ndv_packet_hdr_control_1); + + hvnd_debug("CREATE CQ, num user addr pfns is %d\n", num_pfn); + hvnd_debug("CREATE CQ, num db pfns is %d\n", num_db_pfn); + + pkt = kzalloc(cq_pkt_size, GFP_KERNEL); + + if (!pkt) + return -ENOMEM; + + hvnd_init_hdr(&pkt->hdr, + (cq_pkt_size - + sizeof(struct ndv_packet_hdr_control_1)), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CQ_CREATE, + ext_data_sz, + ext_data_offset, + 0); + + /* + * Now fill out the ioctl section. + */ + + pkt->ioctl.input.version = ND_VERSION_1; + pkt->ioctl.input.queue_depth = cq->entries; + pkt->ioctl.input.ce_mapping_cnt = MLX4_IB_CREATE_CQ_MAPPING_MAX; + pkt->ioctl.input.cb_mapping_offset = sizeof(union create_cq_ioctl); + + hvnd_debug("ce_mapping cnt is %d\n", pkt->ioctl.input.ce_mapping_cnt); + hvnd_debug("cb_mapping offset is %d\n", + pkt->ioctl.input.cb_mapping_offset); + + pkt->ioctl.input.adapter_handle = uctx->adaptor_hdl; + pkt->ioctl.input.affinity.mask = 0; + pkt->ioctl.input.affinity.group = -1; + + // 0 for usermode CQ arming + pkt->mappings.cq_in.flags = 0; + + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.map_type = ND_MAP_MEMORY; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.access_type = ND_MODIFY_ACCESS; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.address = (u64)cq->cq_buf; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.cb_length = (cq->entries * uctx->cqe_size); + + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.access_type = ND_WRITE_ACCESS; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.address = (u64)cq->db_addr; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.cb_length = 8; //size of two ints + + + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.map_type = ND_MAP_MEMORY; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.access_type = ND_MODIFY_ACCESS; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.address = (u64)&cq->arm_sn; + pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.cb_length = 4; //size of one int + /* + * Fill in the extended data. + */ + + pkt->ext_data.cnt = 3; + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].size = (sizeof(struct gpa_range) + (num_pfn * sizeof(u64))); + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].offset = offsetof(struct create_cq_ext_data, cqbuf_gpa); + + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].size = sizeof(struct cq_db_gpa); + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].offset = offsetof(struct create_cq_ext_data, db_gpa); + + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].size = sizeof(struct cq_db_gpa); + pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].offset = offsetof(struct create_cq_ext_data, sn_gpa); + + /* + * Fill up the gpa range for cq buffer. 
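+	 *
+	 * Each gpa_range below is a byte_count/byte_offset pair followed
+	 * by a PFN array, e.g. for the doorbell:
+	 *
+	 *   db_gpa.byte_count  = 8;    // two 32-bit doorbell words
+	 *   db_gpa.byte_offset = offset_in_page(cq->db_addr);
+	 *   db_gpa.pfn_array[] = PFNs of the page(s) backing db_addr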
+ */ + + pkt->ext_data.db_gpa.byte_count = 8; + pkt->ext_data.db_gpa.byte_offset = offset_in_page(cq->db_addr); + user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], cq->db_umem); + + pkt->ext_data.sn_gpa.byte_count = 4; + pkt->ext_data.sn_gpa.byte_offset = offset_in_page(&cq->arm_sn); + init_pfn(&pkt->ext_data.sn_gpa.pfn_array[0], + &cq->arm_sn, + 4); + + pkt->ext_data.cqbuf_gpa.byte_count = (cq->entries * uctx->cqe_size); + pkt->ext_data.cqbuf_gpa.byte_offset = offset_in_page(cq->cq_buf); + user_va_init_pfn(&pkt->ext_data.cqbuf_gpa.pfn_array[0], cq->umem); + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, cq_pkt_size, (u64)pkt); + + if (ret) + goto cr_cq_err; + + /* + * Copy the necessary response from the host. + */ + cq->cqn = pkt->mappings.cq_resp.cqn; + cq->cqe = pkt->mappings.cq_resp.cqe; + cq->cq_handle = pkt->ioctl.resrc_desc.handle; + + ret = insert_handle(nd_dev, &nd_dev->cqidr, cq, cq->cqn); + + if (ret) + goto cr_cq_err; + hvnd_debug("CQ create after success cqn is %d\n", cq->cqn); + hvnd_debug("CQ create after success cqe is %d\n", cq->cqe); + hvnd_debug("CQ create after success cq handle is %p\n", + (void *)cq->cq_handle); + +cr_cq_err: + kfree(pkt); + return ret; +} + +int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq) +{ + struct pkt_nd_free_cq free_cq_pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&free_cq_pkt, 0, sizeof(free_cq_pkt)); + hvnd_init_hdr(&free_cq_pkt.hdr, + sizeof(struct pkt_nd_free_cq) - + sizeof(struct ndv_packet_hdr_control_1), + cq->uctx->create_pkt.handle.local, + cq->uctx->create_pkt.handle.remote, + IOCTL_ND_CQ_FREE, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + free_cq_pkt.ioctl.in.version = ND_VERSION_1; + free_cq_pkt.ioctl.in.handle = cq->cq_handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &free_cq_pkt.hdr, + sizeof(struct pkt_nd_free_cq), + (u64)&free_cq_pkt); + + if (ret) + goto free_cq_err; + + remove_handle(nd_dev, &nd_dev->cqidr, cq->cqn); + + return 0; + +free_cq_err: + return ret; +} + + +int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq, + u32 notify_type, u64 irp_handle) +{ + struct pkt_nd_notify_cq notify_cq_pkt; + int ret; + union ndv_context_handle irp_fhandle; + + irp_fhandle.local = cq->ep_object.local_irp; + + // KYS try to avoid having to zero everything + memset(¬ify_cq_pkt, 0, sizeof(notify_cq_pkt)); + hvnd_init_hdr(¬ify_cq_pkt.hdr, + sizeof(struct pkt_nd_notify_cq) - + sizeof(struct ndv_packet_hdr_control_1), + cq->uctx->create_pkt.handle.local, + cq->uctx->create_pkt.handle.remote, + IOCTL_ND_CQ_NOTIFY, 0, 0, irp_fhandle.val64); + + /* + * Now fill in the ioctl section. + */ + notify_cq_pkt.ioctl.in.version = ND_VERSION_1; + notify_cq_pkt.ioctl.in.cq_handle = cq->cq_handle; + notify_cq_pkt.ioctl.in.type = notify_type; + + + ret = hvnd_send_ioctl_pkt(nd_dev, ¬ify_cq_pkt.hdr, + sizeof(struct pkt_nd_notify_cq), + (u64)¬ify_cq_pkt); + + return ret; +} + +/* + * Memory region operations. + */ +int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 pd_handle, u64 *mr_handle) +{ + struct pkt_nd_create_mr pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_MR_CREATE, 0, 0, 0); + + /* + * Now fill in the ioctl section. 
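+	 *
+	 * Only the PD handle goes in; on success the host returns the
+	 * new MR handle in pkt.ioctl.out. A caller would use this as,
+	 * for example:
+	 *
+	 *   u64 mr_handle;
+	 *   ret = hvnd_cr_mr(nd_dev, uctx, pd_handle, &mr_handle);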
+	 */
+	pkt.ioctl.in.version = ND_VERSION_1;
+	pkt.ioctl.in.handle = pd_handle;
+	hvnd_debug("PD handle is %p\n", (void *)pd_handle);
+	pkt.ioctl.in.reserved = 0;
+
+	ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt);
+
+	if (ret)
+		goto err;
+
+	/*
+	 * Copy the handle.
+	 */
+	hvnd_debug("mr handle is %p\n", (void *)pkt.ioctl.out);
+	*mr_handle = pkt.ioctl.out;
+
+	return 0;
+
+err:
+	hvnd_error("create mr failed: %d\n", ret);
+	return ret;
+}
+
+int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		 u64 handle)
+{
+	return hvnd_free_handle(nd_dev, uctx, handle, IOCTL_ND_MR_FREE);
+}
+
+int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		       u64 handle)
+{
+	struct pkt_nd_deregister_mr pkt;
+	int ret;
+
+	// KYS try to avoid having to zero everything
+	memset(&pkt, 0, sizeof(pkt));
+	hvnd_init_hdr(&pkt.hdr,
+		      sizeof(pkt) -
+		      sizeof(struct ndv_packet_hdr_control_1),
+		      uctx->create_pkt.handle.local,
+		      uctx->create_pkt.handle.remote,
+		      IOCTL_ND_MR_DEREGISTER, 0, 0, 0);
+
+	/*
+	 * Now fill in the ioctl section.
+	 */
+	pkt.ioctl.in.version = ND_VERSION_1;
+	pkt.ioctl.in.handle = handle;
+	pkt.ioctl.in.reserved = 0;
+
+	ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt);
+
+	if (ret)
+		goto err;
+
+	return 0;
+
+err:
+	hvnd_error("de-register mr failed: %d\n", ret);
+	return ret;
+}
+
+static inline u32 hvnd_convert_access(int acc)
+{
+	return (acc & IB_ACCESS_REMOTE_WRITE ? ND_MR_FLAG_ALLOW_REMOTE_WRITE : 0) |
+	       (acc & IB_ACCESS_REMOTE_READ ? ND_MR_FLAG_ALLOW_REMOTE_READ : 0) |
+	       (acc & IB_ACCESS_LOCAL_WRITE ? ND_MR_FLAG_ALLOW_LOCAL_WRITE : 0);
+}
+
+int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		     struct hvnd_mr *mr)
+{
+	struct pkt_nd_register_mr pkt;
+	int ret;
+	struct hv_mpb_array *pb;
+	struct vmbus_packet_mpb_array *tpb;
+	int sz_leaf;
+	int num_pgs;
+	int i = 0;
+	int ext_data_sz;
+	u32 acc_flags;
+	u32 desc_size;
+	int pkt_type;
+
+	/*
+	 * The user address is passed in via a two-level structure.
+	 * An array of struct hv_page_buffer describes the user memory.
+	 * The pages containing this array are themselves described in
+	 * another array of struct hv_page_buffer, and it is this second
+	 * level array that we pass to the host.
+	 */
+
+	hvnd_debug("ib_umem_page_count(mr->umem)=%d\n",
+		   ib_umem_page_count(mr->umem));
+
+	sz_leaf = ib_umem_page_count(mr->umem) * sizeof(u64) +
+		  sizeof(struct hv_mpb_array);
+
+	pb = (struct hv_mpb_array *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+						     get_order(sz_leaf));
+
+	if (pb == NULL)
+		return -ENOMEM;
+	/*
+	 * Allocate an array of hv_page_buffer to describe the first level.
+	 */
+	num_pgs = DIV_ROUND_UP(sz_leaf, PAGE_SIZE);
+	hvnd_debug("num pages in the top array is %d\n", num_pgs);
+
+	desc_size = (num_pgs * sizeof(u64) +
+		     sizeof(struct vmbus_packet_mpb_array));
+	tpb = (struct vmbus_packet_mpb_array *)
+		__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+				 get_order(desc_size));
+
+	if (tpb == NULL) {
+		free_pages((unsigned long)pb, get_order(sz_leaf));
+		return -ENOMEM;
+	}
+
+	hvnd_debug("sz leaf: %d; pgs in top %d\n", sz_leaf, num_pgs);
+
+	/*
+	 * Now fill the leaf level array.
+	 */
+	pb->len = mr->length;
+	pb->offset = offset_in_page(mr->start);
+	user_va_init_pfn(pb->pfn_array, mr->umem);
+
+	/*
+	 * Now fill out the top level array.
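+	 *
+	 * Worked example (illustrative, 4 KiB pages): a 1 MiB user
+	 * buffer covers 256 pages, so
+	 *
+	 *   sz_leaf = 256 * sizeof(u64) + sizeof(struct hv_mpb_array)
+	 *
+	 * fits in a single page, and the top-level array then needs
+	 * just one PFN entry to describe that leaf page.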
+ */ + for (i = 0; i < num_pgs; i++) { + tpb->range.pfn_array[i] = virt_to_phys( + (u8 *)pb + (PAGE_SIZE * i)) >> PAGE_SHIFT; + hvnd_debug("virtual address = %p\n", + (u8 *)pb + (PAGE_SIZE * i)); + hvnd_debug("physical address = %llx\n", + virt_to_phys((u8 *)pb + (PAGE_SIZE * i))); + hvnd_debug("tpb->range.pfn_array[%d]=%llx\n", i, + tpb->range.pfn_array[i]); + } + + tpb->range.offset = 8; + tpb->range.len = ib_umem_page_count(mr->umem) * sizeof(u64); + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + ext_data_sz = (ib_umem_page_count(mr->umem) * sizeof(u64)); + acc_flags = ND_MR_FLAG_DO_NOT_SECURE_VM | hvnd_convert_access(mr->acc); + hvnd_debug("memory register access flags are: %x\n", acc_flags); + + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_MR_REGISTER, 0, 0, 0); + + /* + * The memory registration call uses a different mechanism to pass + * pfn information. + */ + + pkt_type = pkt.hdr.pkt_hdr.packet_type; + NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTION_EXTERNAL_DATA); + pkt.hdr.pkt_hdr.packet_type = pkt_type; + + pkt.hdr.extended_data.size = ext_data_sz; + pkt.hdr.extended_data.offset = 0; + /* + * Now fill out the ioctl. + */ + + pkt.ioctl.in.header.version = ND_VERSION_1; + pkt.ioctl.in.header.flags = acc_flags; + pkt.ioctl.in.header.cb_length = mr->length; + pkt.ioctl.in.header.target_addr = mr->virt; + pkt.ioctl.in.header.mr_handle = mr->mr_handle; + pkt.ioctl.in.address = mr->virt; + + /* + * Now send the packet to the host. + */ + + ret = hvnd_send_pgbuf_ioctl_pkt(nd_dev, + tpb, desc_size, + &pkt.hdr, + sizeof(pkt), + (unsigned long)&pkt); + + if (ret) + goto err; + + hvnd_info("MR REGISTRATION SUCCESS\n"); + /* + * Copy the mr registration data. + */ + hvnd_debug("mr registration lkey %x\n", pkt.ioctl.out.lkey); + hvnd_debug("mr registration rkey %x\n", pkt.ioctl.out.rkey); + + mr->mr_lkey = pkt.ioctl.out.lkey; + mr->mr_rkey = pkt.ioctl.out.rkey; + + mr->ibmr.lkey = mr->mr_lkey; + mr->ibmr.rkey = be32_to_cpu(mr->mr_rkey); + hvnd_debug("ibmr registration lkey %x\n", mr->ibmr.lkey); + hvnd_debug("ibmr registration rkey %x\n", mr->ibmr.rkey); + + free_pages((unsigned long)pb, get_order(sz_leaf)); + free_pages((unsigned long)tpb, get_order(desc_size)); + + return 0; + +err: + free_pages((unsigned long)pb, get_order(sz_leaf)); + free_pages((unsigned long)tpb, get_order(desc_size)); + + hvnd_error("mr register failed: %d\n", ret); + return ret; +} + +/* + * Listener operations. + */ +int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 *listener_handle) +{ + struct pkt_nd_cr_listener pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_LISTENER_CREATE, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; + hvnd_debug("Adaptor handle is %p\n", (void *)uctx->adaptor_hdl); + pkt.ioctl.in.hdr.reserved = 0; + pkt.ioctl.in.to_semantics = false; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + /* + * Copy the listener handle. 
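+	 *
+	 * The opaque host-side handle returned here identifies the
+	 * listener in every subsequent ioctl (bind, listen,
+	 * get-address, get-connection-request).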
+ */ + hvnd_debug("listener handle is %p\n", (void *)pkt.ioctl.out); + *listener_handle = pkt.ioctl.out; + + return 0; + +err: + hvnd_error("create listener failed: ret=%d uctx=%p adaptor handle=%llu\n", + ret, uctx, uctx->adaptor_hdl); + return ret; + +} + +int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 listener_handle) +{ + struct pkt_nd_free_listener pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_LISTENER_FREE, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.handle = listener_handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + hvnd_error("free listener failed: %d\n", ret); + return ret; +} + +int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 listener_handle, union nd_sockaddr_inet *addr) +{ + struct pkt_nd_bind_listener pkt; + kuid_t uid = current_uid(); + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_LISTENER_BIND, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.handle = listener_handle; + pkt.ioctl.in.hdr.reserved = 0; + + pkt.ioctl.in.authentication_id = (u32)uid.val; + pkt.ioctl.in.is_admin = false; + + memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + hvnd_error("bind listener failed: %d\n", ret); + return ret; +} + +int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 listener_handle, u32 backlog) +{ + struct pkt_nd_listen_listener pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_LISTENER_LISTEN, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.listener_handle = listener_handle; + pkt.ioctl.in.back_log = backlog; + + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + hvnd_error("listen listener failed: %d\n", ret); + return ret; +} + +int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 listener_handle, union nd_sockaddr_inet *addr) +{ + struct pkt_nd_get_addr_listener pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_LISTENER_GET_ADDRESS, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.handle = listener_handle; + + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + /* + * Copy the adddress. 
+ */ + + memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); + + return 0; + +err: + hvnd_error("listen listener failed: %d\n", ret); + return ret; +} + +int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 listener_handle, u64 connector_handle, + u64 irp_handle) +{ + struct pkt_nd_get_connection_listener pkt; + int ret; + union ndv_context_handle irp_fhandle; + + ret = get_irp_handle(nd_dev, &irp_fhandle.local, (void *)irp_handle); + + if (ret) { + hvnd_error("get_irp_handle() failed: err: %d\n", ret); + return ret; + } + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST, 0, 0, + irp_fhandle.val64); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.listener_handle = listener_handle; + pkt.ioctl.in.connector_handle = connector_handle; + + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + hvnd_error("get connection listener failed: %d\n", ret); + return ret; +} + +/* + * Connector APIs. + */ + +int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 *connector_handle) +{ + struct pkt_nd_cr_connector pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(struct pkt_nd_cr_listener) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_CREATE, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; + pkt.ioctl.in.to_semantics = false; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + /* + * Copy the listener handle. + */ + hvnd_debug("connector handle is %p\n", (void *)pkt.ioctl.out); + *connector_handle = pkt.ioctl.out; + + return 0; + +err: + return ret; +} + +int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 handle) +{ + struct pkt_nd_free_connector pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_FREE, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.handle = handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + return ret; +} + +int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 handle, union nd_sockaddr_inet *addr) +{ + struct pkt_nd_bind_connector pkt; + int ret; + kuid_t uid = current_uid(); + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_BIND, 0, 0, 0); + + /* + * Now fill in the ioctl section. 
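+	 *
+	 * As with the listener bind, the host validates the caller, so
+	 * the current uid is passed as the authentication_id.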
+ */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.handle = handle; + + memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); + + pkt.ioctl.in.authentication_id = (u32)uid.val; + pkt.ioctl.in.is_admin = false; + + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + return 0; + +err: + return ret; +} + +int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit, + u32 priv_data_length, const u8 *priv_data, + u64 qp_handle, struct if_physical_addr *phys_addr, + union nd_sockaddr_inet *dest_addr, + struct hvnd_ep_obj *ep) +{ + struct pkt_nd_connector_connect *pkt = &ep->connector_connect_pkt; + int ret; + union ndv_context_handle irp_fhandle; + + hvnd_debug("local irp is %d\n", ep->local_irp); + irp_fhandle.local = ep->local_irp; + + if (priv_data_length > MAX_PRIVATE_DATA_LEN) { + hvnd_error("priv_data_length=%d\n", priv_data_length); + return -EINVAL; + } + + // KYS try to avoid having to zero everything + memset(pkt, 0, sizeof(*pkt)); + hvnd_init_hdr(&pkt->hdr, + sizeof(*pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_CONNECT, 0, 0, irp_fhandle.val64); + + /* + * Now fill in the ioctl section. + */ + pkt->ioctl.in.hdr.version = ND_VERSION_1; + pkt->ioctl.in.hdr.connector_handle = connector_handle; + pkt->ioctl.in.hdr.read_limits.inbound = in_rd_limit; + pkt->ioctl.in.hdr.read_limits.outbound = out_rd_limit; + pkt->ioctl.in.hdr.cb_private_data_length = priv_data_length; + pkt->ioctl.in.hdr.cb_private_data_offset = + offsetof(union connector_connect_ioctl, in.priv_data); + pkt->ioctl.in.hdr.qp_handle = qp_handle; + + memcpy(&pkt->ioctl.in.hdr.phys_addr, phys_addr, + sizeof(struct if_physical_addr)); + + /* + * Luke's code does not copy the ip address. + */ + memcpy(&pkt->ioctl.in.hdr.destination_address, dest_addr, + sizeof(union nd_sockaddr_inet)); + + pkt->ioctl.in.retry_cnt = 7; + pkt->ioctl.in.rnr_retry_cnt = 7; + memcpy(pkt->ioctl.in.priv_data, priv_data, priv_data_length); + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, sizeof(*pkt), (u64)pkt); + + if (ret) + goto err; + + return 0; + +err: + return ret; +} + +int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, + enum ibv_qp_state *qp_state) +{ + struct pkt_nd_connector_connect_complete pkt; + int ret; + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_COMPLETE_CONNECT, 0, 0, 0); + + /* + * Now fill in the ioctl section. 
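+	 *
+	 * On success the host reports the resulting queue pair state in
+	 * pkt.ioctl.out.state, which is handed back through *qp_state.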
+ */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.handle = connector_handle; + pkt.ioctl.in.rnr_nak_to = 0; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + *qp_state = pkt.ioctl.out.state; + return 0; + +err: + return ret; +} + +int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 connector_handle, + u64 qp_handle, + u32 in_rd_limit, u32 out_rd_limit, + u32 priv_data_length, const u8 *priv_data, + enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep) +{ + struct pkt_nd_connector_accept pkt; + int ret; + union ndv_context_handle irp_fhandle; + + irp_fhandle.local = ep->local_irp; + + if (priv_data_length > MAX_PRIVATE_DATA_LEN) { + hvnd_error("priv_data_length=%d\n", priv_data_length); + return -EINVAL; + } + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_ACCEPT, 0, 0, irp_fhandle.val64); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.reserved = 0; + pkt.ioctl.in.hdr.read_limits.inbound = in_rd_limit; + pkt.ioctl.in.hdr.read_limits.outbound = out_rd_limit; + pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; + + pkt.ioctl.in.hdr.cb_private_data_offset = + offsetof(struct connector_accept_in, private_data); + + pkt.ioctl.in.hdr.connector_handle = connector_handle; + pkt.ioctl.in.hdr.qp_handle = qp_handle; + + pkt.ioctl.in.rnr_nak_to = 0; + pkt.ioctl.in.rnr_retry_cnt = 7; + + + memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + *qp_state = pkt.ioctl.out.state; + return 0; + +err: + return ret; +} + +int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 connector_handle, + u32 priv_data_length, u8 *priv_data, + enum ibv_qp_state *qp_state) +{ + struct pkt_nd_connector_reject pkt; + int ret; + + if (priv_data_length > MAX_PRIVATE_DATA_LEN) { + hvnd_error("priv_data_length=%d\n", priv_data_length); + return -EINVAL; + } + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_REJECT, 0, 0, 0); + + /* + * Now fill in the ioctl section. 
+ */ + pkt.ioctl.in.hdr.version = ND_VERSION_1; + pkt.ioctl.in.hdr.reserved = 0; + pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; + + pkt.ioctl.in.hdr.cb_private_data_offset = + offsetof(struct connector_reject_in, private_data); + + pkt.ioctl.in.hdr.connector_handle = connector_handle; + + memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + *qp_state = pkt.ioctl.out.state; + return 0; + +err: + return ret; +} + +int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, + struct nd_read_limits *rd_limits) +{ + struct pkt_nd_connector_get_rd_limits pkt; + int ret; + + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_GET_READ_LIMITS, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.in.version = ND_VERSION_1; + pkt.ioctl.in.in.reserved = 0; + pkt.ioctl.in.in.handle = connector_handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + *rd_limits = pkt.ioctl.out.out; + return 0; + +err: + return ret; +} + +int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, + u8 *priv_data) +{ + struct pkt_nd_connector_get_priv_data pkt; + int ret; + + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA, 0, 0, 0); + + /* + * Now fill in the ioctl section. + */ + pkt.ioctl.in.version = ND_VERSION_1; + pkt.ioctl.in.reserved = 0; + pkt.ioctl.in.handle = connector_handle; + + ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); + + if (ret) + goto err; + + memcpy(priv_data, pkt.ioctl.out, MAX_PRIVATE_DATA_LEN); + return 0; + +err: + return ret; +} + +int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, + union nd_sockaddr_inet *peer_addr) +{ + struct pkt_nd_connector_get_peer_addr pkt; + int ret; + + + // KYS try to avoid having to zero everything + memset(&pkt, 0, sizeof(pkt)); + hvnd_init_hdr(&pkt.hdr, + sizeof(pkt) - + sizeof(struct ndv_packet_hdr_control_1), + uctx->create_pkt.handle.local, + uctx->create_pkt.handle.remote, + IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS, 0, 0, 0); + + /* + * Now fill in the ioctl section. 
+	 */
+	pkt.ioctl.in.version = ND_VERSION_1;
+	pkt.ioctl.in.reserved = 0;
+	pkt.ioctl.in.handle = connector_handle;
+
+	ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt);
+
+	if (ret)
+		goto err;
+
+	memcpy(peer_addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet));
+	return 0;
+
+err:
+	return ret;
+}
+
+int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev,
+				  struct hvnd_ucontext *uctx,
+				  u64 connector_handle,
+				  union nd_sockaddr_inet *addr)
+{
+	struct pkt_nd_connector_get_addr pkt;
+	int ret;
+
+	// KYS try to avoid having to zero everything
+	memset(&pkt, 0, sizeof(pkt));
+	hvnd_init_hdr(&pkt.hdr,
+		      sizeof(pkt) -
+		      sizeof(struct ndv_packet_hdr_control_1),
+		      uctx->create_pkt.handle.local,
+		      uctx->create_pkt.handle.remote,
+		      IOCTL_ND_CONNECTOR_GET_ADDRESS, 0, 0, 0);
+
+	/*
+	 * Now fill in the ioctl section.
+	 */
+	pkt.ioctl.in.version = ND_VERSION_1;
+	pkt.ioctl.in.reserved = 0;
+	pkt.ioctl.in.handle = connector_handle;
+
+	ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt);
+
+	if (ret)
+		goto err;
+
+	memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet));
+	return 0;
+
+err:
+	return ret;
+}
+
+int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev,
+				     struct hvnd_ucontext *uctx,
+				     u64 connector_handle,
+				     struct hvnd_ep_obj *ep)
+{
+	struct pkt_nd_connector_notify_disconnect pkt;
+	int ret;
+	union ndv_context_handle irp_fhandle;
+
+	irp_fhandle.local = ep->local_irp;
+
+	hvnd_init_hdr(&pkt.hdr,
+		      sizeof(pkt) -
+		      sizeof(struct ndv_packet_hdr_control_1),
+		      uctx->create_pkt.handle.local,
+		      uctx->create_pkt.handle.remote,
+		      IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT, 0, 0,
+		      irp_fhandle.val64);
+
+	/*
+	 * Now fill in the ioctl section.
+	 */
+	pkt.ioctl.in.version = ND_VERSION_1;
+	pkt.ioctl.in.reserved = 0;
+	pkt.ioctl.in.handle = connector_handle;
+
+	ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt);
+
+	if (ret)
+		goto err;
+
+	return 0;
+
+err:
+	return ret;
+}
+
+//ASYNCH call
+int hvnd_connector_disconnect(struct hvnd_dev *nd_dev,
+			      struct hvnd_ucontext *uctx,
+			      u64 connector_handle, struct hvnd_ep_obj *ep)
+{
+	struct pkt_nd_connector_disconnect pkt;
+	int ret;
+	union ndv_context_handle irp_fhandle;
+
+	irp_fhandle.local = ep->local_irp;
+
+	hvnd_init_hdr(&pkt.hdr,
+		      sizeof(pkt) -
+		      sizeof(struct ndv_packet_hdr_control_1),
+		      uctx->create_pkt.handle.local,
+		      uctx->create_pkt.handle.remote,
+		      IOCTL_ND_CONNECTOR_DISCONNECT, 0, 0,
+		      irp_fhandle.val64);
+
+	/*
+	 * Now fill in the ioctl section.
+	 */
+	pkt.ioctl.in.version = ND_VERSION_1;
+	pkt.ioctl.in.reserved = 0;
+	pkt.ioctl.in.handle = connector_handle;
+
+	ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt);
+
+	if (ret)
+		goto err;
+
+	return 0;
+
+err:
+	return ret;
+}
+
+/*
+ * QP operations.
+ */
+int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		   struct hvnd_qp *qp)
+{
+	int ret;
+	struct pkt_nd_create_qp *pkt;
+	int num_pfn, num_db_pfn;
+	int qp_pkt_size;
+	unsigned int offset;
+	u32 ext_data_offset;
+	u32 ext_data_size;
+
+	/*
+	 * Now create the QP.
+	 * First compute the number of PFNs we need to accommodate:
+	 * one for the doorbell page plus the pages in the QP buffer.
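+	 *
+	 * Worked example (illustrative values, 4 KiB pages):
+	 *
+	 *   buf_size = 64 KiB of WQEs   ->  num_pfn    = 16 or 17 pages
+	 *   db_addr  = one 4-byte word  ->  num_db_pfn = 1 page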
+
+/*
+ * QP operations.
+ */
+int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		   struct hvnd_qp *qp)
+{
+	int ret;
+	struct pkt_nd_create_qp *pkt;
+	int num_pfn, num_db_pfn;
+	int qp_pkt_size;
+	unsigned int offset;
+	u32 ext_data_offset;
+	u32 ext_data_size;
+
+	/*
+	 * Now create QP.
+	 * First compute the number of PFNs we need to accommodate:
+	 * the doorbell and the pages in the qp buffer.
+	 */
+	offset = offset_in_page(qp->qp_buf);
+	num_pfn = DIV_ROUND_UP(offset + qp->buf_size, PAGE_SIZE);
+
+	offset = offset_in_page(qp->db_addr);
+	num_db_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE);
+
+	qp_pkt_size = sizeof(struct pkt_nd_create_qp) +
+		      (num_pfn * sizeof(u64));
+
+	hvnd_debug("CREATE QP, num pfns is %d\n", num_pfn);
+	hvnd_debug("CREATE QP, num DB pfns is %d\n", num_db_pfn);
+
+	pkt = kzalloc(qp_pkt_size, GFP_KERNEL);
+
+	if (!pkt)
+		return -ENOMEM;
+
+	hvnd_debug("offset of nd_create_qp is %d\n",
+		   (int)offsetof(struct pkt_nd_create_qp, ioctl.input));
+
+	ext_data_offset = offsetof(struct pkt_nd_create_qp, ext_data) -
+			  sizeof(struct ndv_packet_hdr_control_1);
+
+	ext_data_size = sizeof(struct create_qp_ext_data) +
+			(num_pfn * sizeof(u64));
+
+	hvnd_init_hdr(&pkt->hdr,
+		      qp_pkt_size -
+		      sizeof(struct ndv_packet_hdr_control_1),
+		      uctx->create_pkt.handle.local,
+		      uctx->create_pkt.handle.remote,
+		      IOCTL_ND_QP_CREATE,
+		      ext_data_size,
+		      ext_data_offset,
+		      0);
+
+	/*
+	 * Now fill out the ioctl section.
+	 */
+
+	pkt->ioctl.input.hdr.version = ND_VERSION_1;
+
+	if (qp->max_inline_data > nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold)
+		qp->max_inline_data = nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold;
+	pkt->ioctl.input.hdr.cb_max_inline_data = qp->max_inline_data;
+
+	hvnd_debug("pkt->ioctl.input.hdr.cb_max_inline_data=%d\n",
+		   pkt->ioctl.input.hdr.cb_max_inline_data);
+
+	pkt->ioctl.input.hdr.ce_mapping_cnt = MLX4_IB_CREATE_QP_MAPPINGS_MAX;
+	pkt->ioctl.input.hdr.cb_mapping_offset = sizeof(union create_qp_ioctl);
+
+	pkt->ioctl.input.hdr.initiator_queue_depth = qp->initiator_q_depth;
+	pkt->ioctl.input.hdr.max_initiator_request_sge =
+		qp->initiator_request_sge;
+
+	hvnd_debug("recv cq handle is %p\n", (void *)qp->receive_cq_handle);
+	hvnd_debug("send cq handle is %p\n", (void *)qp->initiator_cq_handle);
+	hvnd_debug("pd handle is %p\n", (void *)qp->pd_handle);
+	pkt->ioctl.input.hdr.receive_cq_handle = qp->receive_cq_handle;
+	pkt->ioctl.input.hdr.initiator_cq_handle = qp->initiator_cq_handle;
+	pkt->ioctl.input.hdr.pd_handle = qp->pd_handle;
+
+
+	hvnd_debug("ce_mapping cnt is %d\n",
+		   pkt->ioctl.input.hdr.ce_mapping_cnt);
+	hvnd_debug("cb_mapping offset is %d\n",
+		   pkt->ioctl.input.hdr.cb_mapping_offset);
+
+	pkt->ioctl.input.receive_queue_depth = qp->receive_q_depth;
+	pkt->ioctl.input.max_receive_request_sge = qp->receive_request_sge;
+
+
+	pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.map_type = ND_MAP_MEMORY;
+	pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.access_type = ND_MODIFY_ACCESS;
+	pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.address = (u64)qp->qp_buf;
+	pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.cb_length = qp->buf_size;
+
+	pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE;
+	pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.access_type = ND_WRITE_ACCESS;
+	pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.address = (u64)qp->db_addr;
+	pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.cb_length = 4;
+
+	pkt->mappings.qp_in.log_sq_bb_count = qp->log_sq_bb_count;
+	pkt->mappings.qp_in.log_sq_stride = qp->log_sq_stride;
+	pkt->mappings.qp_in.sq_no_prefetch = qp->sq_no_prefetch;
+
+
+	/*
+	 * Fill in the extended data.
+	 */
+
+	pkt->ext_data.cnt = 2;
+	pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].size =
+		sizeof(struct gpa_range) + (num_pfn * sizeof(u64));
+	pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].offset =
+		offsetof(struct create_qp_ext_data, qpbuf_gpa);
+
+	pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].size =
+		sizeof(struct qp_db_gpa);
+	pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].offset =
+		offsetof(struct create_qp_ext_data, db_gpa);
+
+	/*
+	 * Fill up the gpa range for qp buffer.
+	 */
+
+	pkt->ext_data.db_gpa.byte_count = 4; // KYS 8 or 16?
+	pkt->ext_data.db_gpa.byte_offset = offset_in_page(qp->db_addr);
+	user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], qp->db_umem);
+
+	pkt->ext_data.qpbuf_gpa.byte_count = qp->buf_size;
+	pkt->ext_data.qpbuf_gpa.byte_offset = offset_in_page(qp->qp_buf);
+	user_va_init_pfn(&pkt->ext_data.qpbuf_gpa.pfn_array[0], qp->umem);
+
+	ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, qp_pkt_size, (u64)pkt);
+
+	if (ret)
+		goto cr_qp_err;
+
+	/*
+	 * Copy the necessary response from the host.
+	 */
+	qp->qp_handle = pkt->ioctl.resrc_desc.handle;
+
+	qp->qpn = pkt->mappings.qp_resp.qpn;
+	qp->max_send_wr = pkt->mappings.qp_resp.max_send_wr;
+	qp->max_recv_wr = pkt->mappings.qp_resp.max_recv_wr;
+	qp->max_send_sge = pkt->mappings.qp_resp.max_send_sge;
+	qp->max_recv_sge = pkt->mappings.qp_resp.max_recv_sge;
+
+
+	hvnd_debug("qp->max_send_wr=%d max_recv_wr=%d max_send_sge=%d max_recv_sge=%d max_inline_data=%d\n",
+		   qp->max_send_wr, qp->max_recv_wr, qp->max_send_sge,
+		   qp->max_recv_sge, qp->max_inline_data);
+
+	ret = insert_handle(nd_dev, &nd_dev->qpidr, qp, qp->qpn);
+
+	if (ret)
+		goto cr_qp_err;
+
+	hvnd_debug("QP create after success qpn:%d qp:%p handle:%llu\n",
+		   qp->qpn, qp, qp->qp_handle);
+
+cr_qp_err:
+	kfree(pkt);
+	return ret;
+}
+
+int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		 struct hvnd_qp *qp)
+{
+	int ret;
+
+	ret = hvnd_free_handle(nd_dev, uctx, qp->qp_handle, IOCTL_ND_QP_FREE);
+	if (ret == 0)
+		remove_handle(nd_dev, &nd_dev->qpidr, qp->qpn);
+	return ret;
+}
+
+int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		  struct hvnd_qp *qp)
+{
+	struct pkt_nd_flush_qp pkt;
+	int ret;
+
+	// KYS try to avoid having to zero everything
+	memset(&pkt, 0, sizeof(pkt));
+	hvnd_init_hdr(&pkt.hdr,
+		      sizeof(pkt) -
+		      sizeof(struct ndv_packet_hdr_control_1),
+		      uctx->create_pkt.handle.local,
+		      uctx->create_pkt.handle.remote,
+		      IOCTL_ND_QP_FLUSH, 0, 0, 0);
+
+	/*
+	 * Now fill in the ioctl section.
+	 */
+	pkt.ioctl.in.version = ND_VERSION_1;
+	pkt.ioctl.in.reserved = 0;
+	pkt.ioctl.in.handle = qp->qp_handle;
+
+	ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt);
+
+	if (ret)
+		goto err;
+
+	return 0;
+
+err:
+	return ret;
+}
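[Reviewer note] The PFN sizing in hvnd_create_qp() adds the intra-page offset of the buffer before rounding up, since a buffer that straddles a page boundary needs the extra frame. A worked example, assuming PAGE_SIZE is 4096:

	/*
	 * A 6000-byte qp buffer that starts 512 bytes into a page:
	 *
	 *   offset  = offset_in_page(qp_buf);               // 512
	 *   num_pfn = DIV_ROUND_UP(512 + 6000, PAGE_SIZE);  // ceil(6512/4096) = 2
	 *
	 * A 4000-byte buffer at offset 512 needs 2 frames
	 * (ceil(4512/4096)), where a page-aligned one would need only 1;
	 * dropping the offset from the computation would undercount it.
	 */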
+
+
+int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr,
+		  char *mac_addr)
+{
+	int ret;
+	int pkt_type = NDV_PKT_ID1_BIND;
+
+	/*
+	 * Send the bind information over to the host.
+	 * For now, we will have a single ip and MAC address that we
+	 * will deal with. Down the road we will need to expand support
+	 * for multiple IP and MAC addresses and also deal with changing
+	 * IP addresses.
+	 */
+
+	NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE);
+	hvnd_debug("bind packet type is %d ID:%d\n",
+		   pkt_type, NDV_PACKET_TYPE_ID(pkt_type));
+	nd_dev->bind_pkt.pkt_hdr.packet_type = pkt_type;
+
+	nd_dev->bind_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_bind_1);
+	hvnd_debug("bind packet size is %d\n",
+		   (int)sizeof(struct ndv_pkt_hdr_bind_1));
+	nd_dev->bind_pkt.pkt_hdr.data_sz = 0;
+	nd_dev->bind_pkt.unbind = un_bind;
+	nd_dev->bind_pkt.ip_address.address_family = AF_INET;
+	nd_dev->bind_pkt.ip_address.ipv4.sin_family = AF_INET;
+	nd_dev->bind_pkt.ip_address.ipv4.sin_port = 0;
+	nd_dev->bind_pkt.ip_address.ipv4.sin_addr.s_addr = *(unsigned int *)ip_addr;
+
+	nd_dev->bind_pkt.phys_addr.length = ETH_ALEN;
+	memcpy(nd_dev->bind_pkt.phys_addr.addr, mac_addr, ETH_ALEN);
+
+	/*
+	 * This is the adapter handle; needs to be unique for each
+	 * MAC, ip address tuple.
+	 */
+	nd_dev->bind_pkt.guest_id = (u64)nd_dev;
+
+	ret = hvnd_send_packet(nd_dev, &nd_dev->bind_pkt,
+			       sizeof(struct ndv_pkt_hdr_bind_1),
+			       (u64)NULL,
+			       true);
+	return ret;
+}
+
+int hvnd_init_resources(struct hvnd_dev *nd_dev)
+{
+	unsigned long mmio_sz;
+	struct resource *resrc;
+	int ret = -ENOMEM;
+
+	resrc = &iomem_resource;
+
+	mmio_sz = nd_dev->hvdev->channel->offermsg.offer.mmio_megabytes *
+		  1024 * 1024;
+	nd_dev->mmio_sz = mmio_sz;
+	nd_dev->mmio_resource.name = KBUILD_MODNAME;
+	nd_dev->mmio_resource.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+
+	ret = allocate_resource(resrc, &nd_dev->mmio_resource,
+				mmio_sz, 0, -1, mmio_sz, NULL, NULL);
+
+	if (ret) {
+		hvnd_error("Unable to allocate mmio resources\n");
+		return ret;
+	}
+	hvnd_debug("MMIO start is %p\n", (void *)nd_dev->mmio_resource.start);
+
+	/*
+	 * Send the mmio information over to the host.
+	 */
+	nd_dev->resources.pkt_hdr.packet_type = NDV_PKT_ID1_INIT_RESOURCES;
+	nd_dev->resources.pkt_hdr.hdr_sz = sizeof(union ndv_packet_hdr);
+	nd_dev->resources.pkt_hdr.data_sz = 0;
+
+	nd_dev->resources.io_space_sz_mb = mmio_sz;
+	nd_dev->resources.io_space_start = nd_dev->mmio_resource.start;
+
+	ret = hvnd_send_packet(nd_dev, &nd_dev->resources,
+			       sizeof(struct ndv_pkt_hdr_init_resources_1),
+			       (u64)NULL,
+			       true);
+	return ret;
+}
+
+int hvnd_query_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx)
+{
+	struct pkt_nd_query_adaptor *pkt;
+	int ret;
+	int pkt_type;
+	struct ib_device_attr *props = &nd_dev->ibdev.attrs;
+	struct adapter_info_v2 *adap_info = &nd_dev->query_pkt.ioctl.ad_info;
+
+	hvnd_debug("Performing Adapter query nd_dev=%p\n", nd_dev);
+
+	// check if there is a need to do query
+	if (nd_dev->query_pkt_set)
+		return 0;
+
+	// need a lock, multiple processes can call this at the same time
+
+	down(&nd_dev->query_pkt_sem);
+	if (nd_dev->query_pkt_set) {
+		up(&nd_dev->query_pkt_sem);
+		return 0;
+	}
+
+	/*
+	 * Now query the adaptor.
+	 */
+
+	pkt = &nd_dev->query_pkt;
+
+	pkt_type = NDV_PKT_ID1_CONTROL;
+	NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE);
+
+	pkt->hdr.pkt_hdr.packet_type = pkt_type;
+	pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1);
+	pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_query_adaptor) -
+				   sizeof(struct ndv_packet_hdr_control_1);
+
+
+	pkt->hdr.file_handle.local = uctx->file_handle.local;
+	pkt->hdr.file_handle.remote = uctx->file_handle.remote;
+
+	pkt->hdr.irp_handle.val64 = 0;
+
+	pkt->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_QUERY;
+	pkt->hdr.output_buf_sz = sizeof(struct nd_adap_query_ioctl);
+	pkt->hdr.input_buf_sz = sizeof(struct nd_adap_query_ioctl);
+	pkt->hdr.input_output_buf_offset = 0;
+	memset(&pkt->ioctl.ad_q, 0, sizeof(struct nd_adap_query_ioctl));
+
+	pkt->ioctl.ad_q.version = ND_VERSION_1;
+	pkt->ioctl.ad_q.info_version = ND_VERSION_2;
+	pkt->ioctl.ad_q.adapter_handle = uctx->adaptor_hdl;
+
+	ret = hvnd_send_packet(nd_dev, pkt,
+			       sizeof(struct pkt_nd_query_adaptor),
+			       (unsigned long)pkt, true);
+
+	hvnd_debug("pkt->ioctl.ad_info.inline_request_threshold=%d\n",
+		   pkt->ioctl.ad_info.inline_request_threshold);
+	nd_dev->query_pkt_set = true;
+
+	// how about host returning PENDING
+	up(&nd_dev->query_pkt_sem);
+
+	if (ret)
+		return ret;
+
+	hvnd_debug("Query Adaptor Succeeded\n");
+
+	/*
+	 * Cache the relevant properties out.
+	 */
+	props->fw_ver = 0;
+	props->device_cap_flags = 0;
+	props->vendor_id = 0x15b3;
+	props->vendor_part_id = adap_info->device_id;
+	props->max_mr_size = ~0ull;
+	props->page_size_cap = PAGE_SIZE;
+	props->max_qp = 16384;
+	props->max_qp_wr = min(adap_info->max_recv_q_depth,
+			       adap_info->max_initiator_q_depth);
+	props->max_sge = min(adap_info->max_initiator_sge,
+			     adap_info->max_recv_sge);
+	props->max_cq = 0x1FFFF;
+	props->max_cqe = adap_info->max_completion_q_depth;
+	props->max_mr = 16384;
+	props->max_pd = 16384;
+	props->max_qp_rd_atom = adap_info->max_inbound_read_limit;
+	props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit;
+	props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp;
+	props->max_srq = 16384;
+	props->max_srq_wr = adap_info->max_recv_q_depth;
+	props->max_srq_sge = adap_info->max_recv_sge;
+	return 0;
+}
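[Reviewer note] hvnd_query_adaptor() pays for at most one host round trip per device: the unlocked query_pkt_set test is a fast path, and the test is repeated under query_pkt_sem so a caller who lost the race returns without re-querying. The double-checked initialization pattern, distilled (a sketch; only the hvnd_* names come from the patch):

	if (nd_dev->query_pkt_set)		/* fast path, no lock taken */
		return 0;

	down(&nd_dev->query_pkt_sem);
	if (nd_dev->query_pkt_set) {		/* re-check under the lock */
		up(&nd_dev->query_pkt_sem);
		return 0;			/* another caller won the race */
	}
	/* ... single expensive host round trip ... */
	nd_dev->query_pkt_set = true;
	up(&nd_dev->query_pkt_sem);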
+
+
+int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev,
+		   struct hvnd_ib_pd *hvnd_pd)
+{
+	struct pkt_nd_pd_create *pkt = &uctx->pd_cr_pkt;
+	int ret;
+	int pkt_type;
+
+	hvnd_debug("Create Protection Domain\n");
+
+	pkt_type = NDV_PKT_ID1_CONTROL;
+	NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE);
+
+	pkt->hdr.pkt_hdr.packet_type = pkt_type;
+	pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1);
+	pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_pd_create) -
+				   sizeof(struct ndv_packet_hdr_control_1);
+
+	hvnd_debug("pdcreate packet size: %d\n",
+		   (int)sizeof(struct pkt_nd_pd_create));
+	hvnd_debug("pdcreate hdr size: %d\n",
+		   (int)sizeof(struct ndv_packet_hdr_control_1));
+	hvnd_debug("pdcreate data size: %d\n", pkt->hdr.pkt_hdr.data_sz);
+
+	pkt->hdr.file_handle.local = uctx->create_pkt.handle.local;
+	pkt->hdr.file_handle.remote = uctx->create_pkt.handle.remote;
+
+	hvnd_debug("create pd uctx is %p\n", uctx);
+	hvnd_debug("create pd local file is %d\n",
+		   uctx->create_pkt.handle.local);
+	hvnd_debug("create pd remote file is %d\n",
+		   uctx->create_pkt.handle.remote);
+
+	pkt->hdr.irp_handle.val64 = 0;
+	pkt->hdr.io_cntrl_code = IOCTL_ND_PD_CREATE;
+
+	pkt->hdr.output_buf_sz = sizeof(struct nd_create_pd_ioctl);
+	pkt->hdr.input_buf_sz = sizeof(struct nd_create_pd_ioctl);
+	pkt->hdr.input_output_buf_offset = 0;
+
+	hvnd_debug("output/input buf size: %d\n", pkt->hdr.output_buf_sz);
+
+	/*
+	 * Fill the ioctl section.
+	 */
+	pkt->ioctl.in.version = ND_VERSION_1;
+	pkt->ioctl.in.reserved = 0;
+	pkt->ioctl.in.handle = uctx->adaptor_hdl;
+
+
+	ret = hvnd_send_packet(nd_dev, pkt,
+			       sizeof(struct pkt_nd_pd_create),
+			       (unsigned long)pkt, true);
+
+	if (ret)
+		return ret;
+
+	if (pkt->hdr.pkt_hdr.status != 0) {
+		hvnd_error("Create PD failed; status is %d\n",
+			   pkt->hdr.pkt_hdr.status);
+		return -EINVAL;
+	}
+	if (pkt->hdr.io_status != 0) {
+		hvnd_error("Create PD failed; io status is %d\n",
+			   pkt->hdr.io_status);
+		return -EINVAL;
+	}
+
+	hvnd_debug("Create PD Succeeded\n");
+
+	hvnd_debug("pd_handle is %p\n", (void *)pkt->ioctl.resp.pd_handle);
+	hvnd_debug("pdn is %d\n", (int)pkt->ioctl.resp.pdn);
+
+	hvnd_pd->pdn = pkt->ioctl.resp.pdn;
+	hvnd_pd->handle = pkt->ioctl.out_handle;
+
+	return 0;
+}
+
+int hvnd_cancel_io(struct hvnd_ep_obj *ep_object)
+{
+	struct pkt_nd_cancel_io pkt;
+	int ret;
+	u32 ioctl;
+
+	switch (ep_object->type) {
+	case ND_LISTENER:
+		hvnd_debug("LISTENER I/O Cancelled\n");
+		ioctl = IOCTL_ND_LISTENER_CANCEL_IO;
+		break;
+	case ND_CONNECTOR:
+		hvnd_debug("CONNECTOR I/O Cancelled\n");
+		ioctl = IOCTL_ND_CONNECTOR_CANCEL_IO;
+		break;
+	case ND_MR:
+		hvnd_debug("MR I/O Cancelled\n");
+		ioctl = IOCTL_ND_MR_CANCEL_IO;
+		break;
+	case ND_CQ:
+		hvnd_debug("CQ I/O Cancelled\n");
+		ioctl = IOCTL_ND_CQ_CANCEL_IO;
+		break;
+	default:
+		hvnd_error("UNKNOWN object type\n");
+		return -EINVAL;
+	}
+
+	// KYS try to avoid having to zero everything
+	memset(&pkt, 0, sizeof(pkt));
+	hvnd_init_hdr(&pkt.hdr,
+		      sizeof(pkt) -
+		      sizeof(struct ndv_packet_hdr_control_1),
+		      ep_object->uctx->create_pkt.handle.local,
+		      ep_object->uctx->create_pkt.handle.remote,
+		      ioctl, 0, 0, 0);
+
+	/*
+	 * Now fill in the ioctl section.
+	 */
+	pkt.ioctl.in.version = ND_VERSION_1;
+	pkt.ioctl.in.reserved = 0;
+	pkt.ioctl.in.handle = ep_object->ep_handle;
+	hvnd_debug("cancel io handle is %p\n", (void *)ep_object->ep_handle);
+
+	ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, &pkt.hdr,
+				  sizeof(pkt),
+				  (u64)&pkt);
+
+	if (ret)
+		goto err;
+
+	/*
+	 * Now that we have cancelled all I/Os,
+	 */
+
+	return 0;
+
+err:
+	hvnd_error("cancel I/O operation failed\n");
+	return ret;
+}
+
+
+int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		     u64 handle, u32 ioctl)
+{
+	struct pkt_nd_free_handle pkt;
+	int ret;
+
+	hvnd_debug("Freeing handle ioctl is %s; handle is %p\n",
+		   hvnd_get_op_name(ioctl), (void *)handle);
+
+	hvnd_debug("uctx is %p\n", uctx);
+	hvnd_debug("nd_dev is %p\n", nd_dev);
+
+	// KYS try to avoid having to zero everything
+	memset(&pkt, 0, sizeof(pkt));
+	hvnd_init_hdr(&pkt.hdr,
+		      sizeof(pkt) -
+		      sizeof(struct ndv_packet_hdr_control_1),
+		      uctx->create_pkt.handle.local,
+		      uctx->create_pkt.handle.remote,
+		      ioctl, 0, 0, 0);
+
+	/*
+	 * Now fill in the ioctl section.
+	 */
+	pkt.ioctl.in.version = ND_VERSION_1;
+	pkt.ioctl.in.reserved = 0;
+	pkt.ioctl.in.handle = handle;
+
+	ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt);
+
+	if (ret)
+		goto err;
+
+	return 0;
+
+err:
+	hvnd_error("%s: ret=%d\n", __func__, ret);
+	return ret;
+}
+
+int hvnd_negotiate_version(struct hvnd_dev *nd_dev)
+{
+	union ndv_packet_init *pkt = &nd_dev->init_pkt;
+	int ret;
+
+	nd_dev->negotiated_version = NDV_PROTOCOL_VERSION_INVALID;
+
+	pkt->packet_type = NDV_PACKET_TYPE_INIT;
+	pkt->protocol_version = NDV_PROTOCOL_VERSION_CURRENT;
+	pkt->flags = 0; // KYS are the flags 0?
+
+	ret = hvnd_send_packet(nd_dev, pkt,
+			       sizeof(union ndv_packet_init), (u64)NULL, true);
+
+	return ret;
+}
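[Reviewer note] hvnd_negotiate_version() only sends the INIT packet; the host's answer is recorded asynchronously by hvnd_callback() below, which overwrites negotiated_version with the host's protocol_version. A plausible call site would therefore check the result once the send has completed; this is a sketch, and the -EPROTO choice is illustrative:

	ret = hvnd_negotiate_version(nd_dev);
	if (ret)
		return ret;

	/* hvnd_callback() has filled this in from the host's INIT reply */
	if (nd_dev->negotiated_version != NDV_PROTOCOL_VERSION_CURRENT)
		return -EPROTO;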
+
+void hvnd_callback(void *context)
+{
+	struct hv_device *dev = context;
+	struct hvnd_dev *nd_dev = hv_get_drvdata(dev);
+	int copy_sz = 0;
+	struct ndv_packet_hdr_control_1 *ctrl_hdr;
+	union ndv_packet_init *pkt_init;
+	u32 recvlen;
+	u32 local_irp;
+	u64 requestid;
+	u32 *pkt_type;
+	u32 pkt_id;
+	struct hvnd_ep_obj *ep_object;
+	struct incoming_pkt *incoming_pkt; /* Used only for asynch calls */
+	char *incoming_pkt_start;
+	struct vmpacket_descriptor *desc;
+	int status;
+	struct hvnd_cookie *hvnd_cookie;
+	unsigned long flags;
+
+	vmbus_recvpacket_raw(dev->channel, hvnd_recv_buffer,
+			     (PAGE_SIZE * 4), &recvlen, &requestid);
+
+	if (recvlen <= 0)
+		return;
+
+	desc = (struct vmpacket_descriptor *)hvnd_recv_buffer;
+	incoming_pkt_start = hvnd_recv_buffer + (desc->offset8 << 3);
+	recvlen -= desc->offset8 << 3;
+
+	pkt_type = (u32 *)incoming_pkt_start;
+	pkt_id = *pkt_type;
+	if (pkt_id != NDV_PACKET_TYPE_INIT)
+		pkt_id = NDV_PACKET_TYPE_ID(pkt_id);
+
+	switch (pkt_id) {
+	case NDV_PACKET_TYPE_INIT:
+		/*
+		 * Host is responding to our init packet.
+		 */
+		pkt_init = (union ndv_packet_init *)incoming_pkt_start;
+		nd_dev->negotiated_version = pkt_init->protocol_version;
+		copy_sz = 0;
+		break;
+
+	case NDV_PKT_ID1_INIT_RESOURCES:
+		copy_sz = 0;
+		break;
+
+	case NDV_PKT_ID1_BIND:
+		nd_dev->bind_pkt.pkt_hdr.status =
+			((union ndv_packet_hdr *)incoming_pkt_start)->status;
+		copy_sz = 0;
+		break;
+
+	case NDV_PKT_ID1_COMPLETE:
+		ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start;
+		status = ctrl_hdr->io_status;
+
+		local_irp = ctrl_hdr->irp_handle.local;
+		ep_object = (struct hvnd_ep_obj *)map_irp_to_ctx(nd_dev,
+								 local_irp);
+
+		if (!ep_object) {
+			hvnd_error("irp could not be mapped; irp is %d ioctl is %s",
+				   local_irp,
+				   hvnd_get_op_name(ctrl_hdr->io_cntrl_code));
+			goto complete;
+		}
+
+		if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY)
+			hvnd_debug("completion packet; iostatus is %x, ioctl is %s",
+				   ctrl_hdr->io_status,
+				   hvnd_get_op_name(ctrl_hdr->io_cntrl_code));
+
+		switch (ctrl_hdr->io_cntrl_code) {
+
+		case IOCTL_ND_CQ_NOTIFY:
+			hvnd_process_cq_event_complete(ep_object, status);
+
+			ep_del_work_pending(ep_object);
+			goto complete;
+
+		case IOCTL_ND_CONNECTOR_ACCEPT:
+
+			hvnd_process_connector_accept(ep_object, status);
+
+			ep_del_work_pending(ep_object);
+			goto complete;
+
+		case IOCTL_ND_CONNECTOR_DISCONNECT:
+			hvnd_debug("disconnected: ep obj is %p; status: %d\n",
+				   ep_object, status);
+			hvnd_process_disconnect(ep_object, status);
+
+			ep_del_work_pending(ep_object);
+			goto complete;
+
+		default:
+			break;
+		}
+
+		/*
+		 * This is the completion notification;
+		 * the IRP cookie is the state through which
+		 * we will invoke the callback.
+		 */
+		incoming_pkt = kmalloc(recvlen + sizeof(struct incoming_pkt),
+				       GFP_ATOMIC);
+		if (incoming_pkt == NULL) {
+			hvnd_error("Could not alloc memory in callback\n");
+			ep_del_work_pending(ep_object);
+			goto complete;
+		}
+		memcpy(incoming_pkt->pkt, incoming_pkt_start, recvlen);
+
+		spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags);
+		list_add_tail(&incoming_pkt->list_entry,
+			      &ep_object->incoming_pkt_list);
+		spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock,
+				       flags);
+
+		schedule_work(&ep_object->wrk.work);
+
+		goto complete;
+
+	case NDV_PKT_ID1_CREATE:
+		copy_sz = sizeof(struct ndv_pkt_hdr_create_1);
+		break;
+
+	case NDV_PKT_ID1_CLEANUP:
+		copy_sz = sizeof(struct ndv_pkt_hdr_cleanup_1);
+		break;
+
+	case NDV_PKT_ID1_CONTROL:
+		ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start;
+		status = ctrl_hdr->io_status;
+
+		if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY)
+			hvnd_debug("packet; iostatus is %x ioctl is %s",
+				   ctrl_hdr->io_status,
+				   hvnd_get_op_name(ctrl_hdr->io_cntrl_code));
+
+		switch (ctrl_hdr->io_cntrl_code) {
+
+		case IOCTL_ND_PROVIDER_INIT:
+			copy_sz = sizeof(struct pkt_nd_provider_ioctl);
+			break;
+
+		case IOCTL_ND_PROVIDER_BIND_FILE:
+			copy_sz = sizeof(struct pkt_nd_provider_ioctl);
+			break;
+
+		case IOCTL_ND_ADAPTER_OPEN:
+			copy_sz = sizeof(struct pkt_nd_open_adapter);
+			break;
+
+		case IOCTL_ND_ADAPTER_CLOSE:
+			copy_sz = sizeof(struct pkt_nd_free_handle);
+			break;
+
+		case IOCTL_ND_ADAPTER_QUERY:
+			copy_sz = sizeof(struct pkt_nd_query_adaptor);
+			break;
+
+		case IOCTL_ND_PD_CREATE:
+			copy_sz = sizeof(struct pkt_nd_pd_create);
+			break;
+
+		case IOCTL_ND_PD_FREE:
+			copy_sz = sizeof(struct pkt_nd_free_handle);
+			break;
+
+		case IOCTL_ND_CQ_CREATE:
+			copy_sz = sizeof(struct pkt_nd_create_cq);
+			break;
+
+		case IOCTL_ND_CQ_FREE:
+			copy_sz = sizeof(struct pkt_nd_free_cq);
+			break;
+
+		case IOCTL_ND_CQ_NOTIFY: //FIXME check ep stop state
+			local_irp = ctrl_hdr->irp_handle.local;
+			ep_object = (struct hvnd_ep_obj *)
+				map_irp_to_ctx(nd_dev, local_irp);
+			if (!ep_object) {
+				hvnd_error("irp could not be mapped\n");
+				goto complete;
+			}
+			copy_sz = sizeof(struct pkt_nd_notify_cq);
+			hvnd_process_cq_event_pending(ep_object, status);
+			goto complete;
+
+		case IOCTL_ND_LISTENER_CREATE:
+			copy_sz = sizeof(struct pkt_nd_cr_listener);
+			break;
+
+		case IOCTL_ND_LISTENER_FREE:
+			copy_sz = sizeof(struct pkt_nd_free_listener);
+			break;
+
+		case IOCTL_ND_QP_FREE:
+			copy_sz = sizeof(struct pkt_nd_free_handle);
+			break;
+
+		case IOCTL_ND_CONNECTOR_CANCEL_IO:
+		case IOCTL_ND_MR_CANCEL_IO:
+		case IOCTL_ND_CQ_CANCEL_IO:
+		case IOCTL_ND_LISTENER_CANCEL_IO:
+			copy_sz = sizeof(struct pkt_nd_cancel_io);
+			break;
+
+		case IOCTL_ND_LISTENER_BIND:
+			copy_sz = sizeof(struct pkt_nd_bind_listener);
+			break;
+
+		case IOCTL_ND_LISTENER_LISTEN:
+			copy_sz = sizeof(struct pkt_nd_listen_listener);
+			break;
+
+		case IOCTL_ND_LISTENER_GET_ADDRESS:
+			copy_sz = sizeof(struct pkt_nd_get_addr_listener);
+			break;
+
+		case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST:
+			copy_sz = sizeof(struct pkt_nd_get_connection_listener);
+			goto complete; // non-block
+
+		case IOCTL_ND_CONNECTOR_CREATE:
+			copy_sz = sizeof(struct pkt_nd_cr_connector);
+			break;
+
+		case IOCTL_ND_CONNECTOR_FREE:
+			copy_sz = sizeof(struct pkt_nd_free_connector);
+			break;
+
+		case IOCTL_ND_CONNECTOR_BIND:
+			copy_sz = sizeof(struct pkt_nd_free_connector);
+			break;
+
+		// KYS: ALERT: ASYNCH Operation
+		case IOCTL_ND_CONNECTOR_CONNECT:
+			copy_sz = sizeof(struct pkt_nd_connector_connect);
+			goto complete; //non-block
+
+		case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT:
+			copy_sz = sizeof(struct pkt_nd_connector_connect_complete);
+			break;
+
+		// KYS: ALERT: ASYNCH Operation
+		case IOCTL_ND_CONNECTOR_ACCEPT:
+			copy_sz = sizeof(struct pkt_nd_connector_accept);
+			goto complete; //non-block
+
+		case IOCTL_ND_CONNECTOR_REJECT:
+			copy_sz = sizeof(struct pkt_nd_connector_reject);
+			break;
+
+		case IOCTL_ND_CONNECTOR_GET_READ_LIMITS:
+			copy_sz = sizeof(struct pkt_nd_connector_get_rd_limits);
+			break;
+
+		case IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA:
+			copy_sz = sizeof(struct pkt_nd_connector_get_priv_data);
+			break;
+
+		case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS:
+			copy_sz = sizeof(struct pkt_nd_connector_get_peer_addr);
+			break;
+
+		case IOCTL_ND_CONNECTOR_GET_ADDRESS:
+			copy_sz = sizeof(struct pkt_nd_connector_get_addr);
+			break;
+
+		// KYS: ALERT: ASYNCH Operation
+		case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT:
+			copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect);
+			goto complete; //non-block
+
+		// KYS: ALERT: ASYNCH Operation
+		case IOCTL_ND_CONNECTOR_DISCONNECT:
+			hvnd_debug("IOCTL_ND_CONNECTOR_DISCONNECT\n");
+			copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect);
+			goto complete; // non-block
+
+		case IOCTL_ND_QP_CREATE:
+			copy_sz = sizeof(struct pkt_nd_create_qp);
+			break;
+
+		case IOCTL_ND_MR_CREATE:
+			copy_sz = sizeof(struct pkt_nd_create_mr);
+			break;
+
+		case IOCTL_ND_MR_FREE:
+			copy_sz = sizeof(struct pkt_nd_free_handle);
+			break;
+
+		case IOCTL_ND_MR_REGISTER:
+			copy_sz = sizeof(struct pkt_nd_register_mr);
+			break;
+
+		case IOCTL_ND_MR_DEREGISTER:
+			copy_sz = sizeof(struct pkt_nd_deregister_mr);
+			break;
+
+		case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST:
+			copy_sz = sizeof(struct pkt_query_addr_list);
+			break;
+
+		case IOCTL_ND_QP_FLUSH:
+			copy_sz = sizeof(struct pkt_nd_flush_qp);
+			break;
+
+		default:
+			hvnd_warn("Got unknown ioctl: %d\n",
+				  ctrl_hdr->io_cntrl_code);
+			copy_sz = 0;
+			break;
+		}
+
+		break;
+	default:
+		hvnd_warn("Got an unknown packet type %d\n", *pkt_type);
+		break;
+	}
+
+	hvnd_cookie = (struct hvnd_cookie *)requestid;
+	memcpy(hvnd_cookie->pkt, incoming_pkt_start, copy_sz);
+	complete(&hvnd_cookie->host_event);
+
+complete:
+	/* send out ioctl completion packet */
+	if (desc->flags & VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED) {
+		int retry = 5;
+
+		while (true) {
+			int ret;
+
+			ret = vmbus_sendpacket(dev->channel, NULL, 0,
+					       requestid, VM_PKT_COMP, 0);
+			if (ret == 0) {
+				break;
+			} else if (ret == -EAGAIN) {
+				if (--retry == 0) {
+					hvnd_error("give up retrying send completion packet\n");
+					break;
+				}
+				hvnd_warn("retrying send completion packet\n");
+				udelay(100);
+			} else {
+				hvnd_error("unable to send completion packet ret=%d\n",
+					   ret);
+				break;
+			}
+		}
+	}
+
+}
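[Reviewer note] For the synchronous requests in this file, the vmbus requestid doubles as a pointer to a struct hvnd_cookie: hvnd_callback() above copies copy_sz bytes of the reply into cookie->pkt and completes cookie->host_event. The sending side (hvnd_send_packet() is not part of this hunk) presumably pairs with it along these lines; a sketch inferred from the callback, not the patch's actual code:

	struct hvnd_cookie hvnd_cookie;
	int ret;

	hvnd_cookie.pkt = pkt;			/* reply bytes land here */
	init_completion(&hvnd_cookie.host_event);

	ret = vmbus_sendpacket(dev->channel, pkt, pkt_size,
			       (u64)&hvnd_cookie,	/* returned as requestid */
			       VM_PKT_DATA_INBAND,
			       VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED);
	if (ret == 0 && block)
		wait_for_completion(&hvnd_cookie.host_event);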
diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.h
new file mode 100644
index 000000000000..78e534fa8748
--- /dev/null
+++ b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144.0/vmbus_rdma.h
@@ -0,0 +1,2202 @@
+/*
+ * Copyright (c) 2014, Microsoft Corporation.
+ *
+ * Author:
+ *   K. Y. Srinivasan <kys@microsoft.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT. See the GNU General Public License for more
+ * details.
+ *
+ * Bug fixes/enhancements: Long Li <longli@microsoft.com>
+ */
+
+
+#ifndef _VMBUS_RDMA_H
+#define _VMBUS_RDMA_H
+
+
+#include <linux/in.h>
+#include <linux/in6.h>
+#include <rdma/ib_verbs.h>
+#include <linux/idr.h>
+#include <linux/if_ether.h>
+
+/* NetworkDirect version numbers.
+ */
+#define ND_VERSION_1	0x1
+#define ND_VERSION_2	0x20000
+
+#ifndef NDVER
+#define NDVER	ND_VERSION_2
+#endif
+
+#define ND_ADAPTER_FLAG_IN_ORDER_DMA_SUPPORTED		0x00000001
+#define ND_ADAPTER_FLAG_CQ_INTERRUPT_MODERATION_SUPPORTED 0x00000004
+#define ND_ADAPTER_FLAG_MULTI_ENGINE_SUPPORTED		0x00000008
+#define ND_ADAPTER_FLAG_CQ_RESIZE_SUPPORTED		0x00000100
+#define ND_ADAPTER_FLAG_LOOPBACK_CONNECTIONS_SUPPORTED	0x00010000
+
+#define ND_CQ_NOTIFY_ERRORS	0
+#define ND_CQ_NOTIFY_ANY	1
+#define ND_CQ_NOTIFY_SOLICITED	2
+
+#define ND_MR_FLAG_ALLOW_LOCAL_WRITE	0x00000001
+#define ND_MR_FLAG_ALLOW_REMOTE_READ	0x00000002
+#define ND_MR_FLAG_ALLOW_REMOTE_WRITE	0x00000005
+#define ND_MR_FLAG_RDMA_READ_SINK	0x00000008
+#define ND_MR_FLAG_DO_NOT_SECURE_VM	0x80000000
+
+#define ND_OP_FLAG_SILENT_SUCCESS		0x00000001
+#define ND_OP_FLAG_READ_FENCE			0x00000002
+#define ND_OP_FLAG_SEND_AND_SOLICIT_EVENT	0x00000004
+#define ND_OP_FLAG_ALLOW_READ			0x00000008
+#define ND_OP_FLAG_ALLOW_WRITE			0x00000010
+
+#if NDVER >= ND_VERSION_2
+#define ND_OP_FLAG_INLINE	0x00000020
+#endif
+
+#define ND_AF_INET6		23
+#define IF_MAX_ADDR_LENGTH	32
+
+struct group_affinity {
+	u64 mask; //KYS: usually 0
+	u16 group; // KYS usually -1
+	u16 reserved[3];
+};
+
+struct if_physical_addr {
+	u16 length;
+	u8 addr[IF_MAX_ADDR_LENGTH];
+};
+
+struct adapter_info_v2 {
+	u32 info_version;
+	u16 vendor_id;
+	u16 device_id;
+	u64 adapter_id;
+	size_t max_registration_size;
+	size_t max_window_size;
+	u32 max_initiator_sge;
+	u32 max_recv_sge;
+	u32 max_read_sge;
+	u32 max_transfer_length;
+	u32 max_inline_data_size;
+	u32 max_inbound_read_limit;
+	u32 max_outbound_read_limit;
+	u32 max_recv_q_depth;
+	u32 max_initiator_q_depth;
+	u32 max_shared_recv_q_depth;
+	u32 max_completion_q_depth;
+	u32 inline_request_threshold;
+	u32 large_request_threshold;
+	u32 max_caller_data;
+	u32 max_callee_data;
+	u32 adapter_flags;
+} __packed;
+
+struct nd2_adapter_info_32 { //KYS: Check what this is
+	u32 info_version;
+	u16 vendor_id;
+	u16 devic_id;
+	u64 adapter_id;
+	u32 max_registration_size;
+	u32 max_window_size;
+	u32 max_initiator_sge;
+	u32 max_recv_sge;
+	u32 max_read_sge;
+	u32 max_transfer_length;
+	u32 max_inline_data_size;
+	u32 max_inbound_read_limit;
+	u32 max_outbound_read_limit;
+	u32 max_recv_q_depth;
+	u32 max_initiator_q_depth;
+	u32 max_shared_recv_q_depth;
+	u32 max_completion_q_depth;
+	u32 inline_request_threshold;
+	u32 large_request_threshold;
+	u32 max_caller_data;
+	u32 max_callee_data;
+	u32 adapter_flags;
+} __packed;
+
+enum nd2_request_type {
+	ND2_RT_RECEIVE,
+	ND2_RT_SEND,
+	ND2_RT_BIND,
+	ND2_RT_INVALIDATE,
+	ND2_RT_READ,
+	ND2_RT_WRITE
+};
+
+struct nd2_result {
+	u32 status;
+	u32 bytes_transferred;
+	void *qp_ctx;
+	void *request_ctx;
+	enum nd2_request_type request_type;
+} __packed;
+
+struct nd2_sge {
+	void *buffer;
+	u32 buffer_length;
+	u32 mr_token;
+} __packed;
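[Reviewer note] struct nd2_sge mirrors the verbs scatter/gather element (struct ib_sge), with the MR token in place of an lkey. A minimal sketch of describing a two-element gather list; the buffer pointers, lengths, and token values are illustrative:

	struct nd2_sge sgl[2] = {
		{ .buffer = hdr_buf,  .buffer_length = 64,   .mr_token = hdr_token  },
		{ .buffer = data_buf, .buffer_length = 4096, .mr_token = data_token },
	};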
+
+/*
+ * The communication with the host via ioctls using VMBUS
+ * as the transport.
+ */
+
+#define ND_IOCTL_VERSION 1
+
+enum nd_mapping_type {
+	ND_MAP_IOSPACE,
+	ND_MAP_MEMORY,
+	ND_MAP_MEMORY_COALLESCE,
+	ND_MAP_PAGES,
+	ND_MAP_PAGES_COALLESCE,
+	ND_UNMAP_IOSPACE,
+	ND_UNMAP_MEMORY,
+	ND_MAX_MAP_TYPE
+};
+
+enum nd_caching_type {
+	ND_NON_CACHED = 0,
+	ND_CACHED,
+	ND_WRITE_COMBINED,
+	ND_MAX_CACHE_TYPE
+};
+
+enum nd_aceess_type {
+	ND_READ_ACCESS = 0,
+	ND_WRITE_ACCESS,
+	ND_MODIFY_ACCESS
+};
+
+struct nd_map_io_space {
+	enum nd_mapping_type map_type;
+	enum nd_caching_type cache_type;
+	u32 cb_length;
+};
+
+struct nd_map_memory {
+	enum nd_mapping_type map_type;
+	enum nd_aceess_type access_type;
+	u64 address;
+	u32 cb_length;
+};
+
+struct nd_mapping_id {
+	enum nd_mapping_type map_type;
+	u64 id;
+};
+
+struct ndk_map_pages {
+	struct nd_map_memory header;
+	u32 page_offset;
+};
+
+union nd_mapping {
+	enum nd_mapping_type map_type;
+	struct nd_map_io_space map_io_space;
+	struct nd_map_memory map_memory;
+	struct nd_mapping_id mapping_id;
+	struct ndk_map_pages map_pages;
+};
+
+struct nd_mapping_result {
+	u64 id;
+	u64 info;
+};
+
+struct nd_resource_descriptor {
+	u64 handle;
+	u32 ce_mapping_results;
+	u32 cb_mapping_results_offset;
+};
+
+struct nd_handle {
+	u32 version;
+	u32 reserved;
+	u64 handle;
+};
+
+union nd_sockaddr_inet {
+	struct sockaddr_in ipv4;
+	struct sockaddr_in6 ipv6;
+	u16 address_family; //KYS how is this supposed to work?
+};
+
+struct nd_address_element {
+	union nd_sockaddr_inet addr;
+	char mac_addr[ETH_ALEN];
+};
+
+struct nd_resolve_address {
+	u32 version;
+	u32 reserved;
+	union nd_sockaddr_inet address;
+};
+
+struct nd_open_adapter {
+	u32 version;
+	u32 reserved;
+	u32 ce_mapping_cnt;
+	u32 cb_mapping_offset;
+	u64 adapter_id;
+};
+
+struct nd_adapter_query {
+	u32 version;
+	u32 info_version;
+	u64 adapter_handle;
+};
+
+struct nd_create_cq {
+	u32 version;
+	u32 queue_depth;
+	u32 ce_mapping_cnt;
+	u32 cb_mapping_offset;
+	u64 adapter_handle;
+	struct group_affinity affinity;
+};
+
+struct nd_create_srq {
+	u32 version;
+	u32 queue_depth;
+	u32 ce_mapping_cnt;
+	u32 cb_mapping_offset;
+	u32 max_request_sge;
+	u32 notify_threshold;
+	u64 pd_handle;
+	struct group_affinity affinity;
+};
+
+struct nd_create_qp_hdr {
+	u32 version;
+	u32 cb_max_inline_data;
+	u32 ce_mapping_cnt;
+	u32 cb_mapping_offset; //KYS: what is this prefix - ce/cb
+	u32 initiator_queue_depth;
+	u32 max_initiator_request_sge;
+	u64 receive_cq_handle;
+	u64 initiator_cq_handle;
+	u64 pd_handle;
+};
+
+struct nd_create_qp {
+	struct nd_create_qp_hdr hdr;
+	u32 receive_queue_depth;
+	u32 max_receive_request_sge;
+};
+
+struct nd_create_qp_with_srq {
+	struct nd_create_qp_hdr header;
+	u64 srq_handle;
+};
+
+struct nd_srq_modify {
+	u32 version;
+	u32 queue_depth;
+	u32 ce_mapping_cnt;
+	u32 cb_mapping_offset;
+	u32 notify_threshold;
+	u32 reserved;
+	u64 srq_handle;
+};
+
+struct nd_cq_modify {
+	u32 version;
+	u32 queue_depth;
+	u32 ce_mapping_count;
+	u32 cb_mappings_offset;
+	u64 cq_handle;
+};
+
+struct nd_cq_notify {
+	u32 version;
+	u32 type;
+	u64 cq_handle;
+};
+
+struct nd_mr_register_hdr {
+	u32 version;
+	u32 flags;
+	u64 cb_length;
+	u64 target_addr;
+	u64 mr_handle;
+};
+
+struct nd_mr_register {
+	struct nd_mr_register_hdr header;
+	u64 address;
+};
+
+struct nd_bind {
+	u32 version;
+	u32 reserved;
+	u64 handle;
+	union nd_sockaddr_inet address;
+};
+
+struct nd_read_limits {
+	u32 inbound;
+	u32 outbound;
+};
+
+struct nd_connect {
+	u32 version;
+	u32 reserved;
+	struct nd_read_limits read_limits;
+	u32 cb_private_data_length;
+	u32 cb_private_data_offset;
+	u64 connector_handle;
+	u64 qp_handle;
+	union nd_sockaddr_inet destination_address;
+	struct if_physical_addr phys_addr;
+};
+
+struct nd_accept {
+	u32 version;
+	u32 reserved;
+	struct nd_read_limits read_limits;
+	u32 cb_private_data_length;
+	u32 cb_private_data_offset;
+	u64 connector_handle;
+	u64 qp_handle;
+};
+
+struct nd_reject {
+	u32 version;
+	u32 reserved;
+	u32 cb_private_data_length;
+	u32 cb_private_data_offset;
+	u64 connector_handle;
+};
+
+struct nd_listen {
+	u32 version;
+	u32 back_log;
+	u64 listener_handle;
+};
+
+struct nd_get_connection_request {
+	u32 version;
+	u32 reserved;
+	u64 listener_handle;
+	u64 connector_handle;
+};
+
+enum ndv_mmio_type {
+	ND_PARTITION_KERNEL_VIRTUAL,
+	ND_PARTITION_SYSTEM_PHYSICAL,
+	ND_PARTITION_GUEST_PHYSICAL,
+	ND_MAXIMUM_MMIO_TYPE
+};
+
+struct ndv_resolve_adapter_id {
+	u32 version;
+	struct if_physical_addr phys_addr;
+};
+
+struct ndv_partition_create {
+	u32 version;
+	enum ndv_mmio_type mmio_type;
+	u64 adapter_id;
+	u64 xmit_cap;
+};
+
+struct ndv_partition_bind_luid {
+	u32 version;
+	u32 reserved;
+	u64 partition_handle;
+	struct if_physical_addr phys_addr;
+	//IF_LUID luid; //KYS?
+};
+
+struct ndv_partition_bind_address {
+	u32 version;
+	u32 reserved;
+	u64 partition_handle;
+	union nd_sockaddr_inet address;
+	struct if_physical_addr guest_phys_addr;
+	struct if_physical_addr phys_addr;
+};
+
+struct ndk_mr_register {
+	struct nd_mr_register_hdr hdr;
+	u32 cb_logical_page_addresses_offset;
+};
+
+struct ndk_bind {
+	struct nd_bind hdr;
+	u64 authentication_id;
+	bool is_admin;
+};
+
+#define FDN 0x12
+#define METHOD_BUFFERED 0x0
+#define FAA 0x0
+
+#define CTL_CODE( DeviceType, Function, Method, Access ) ( \
+	((DeviceType) << 16) | ((Access) << 14) | ((Function) << 2) | (Method) \
+)
+
+#define ND_FUNCTION(r_, i_) ((r_) << 6 | (i_))
+#define IOCTL_ND(r_, i_) \
+	CTL_CODE( FDN, ND_FUNCTION((r_), (i_)), METHOD_BUFFERED, FAA )
+
+#define ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) ((ctrlCode_ >> 2) & 0xFFF)
+#define ND_RESOURCE_FROM_CTL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) >> 6)
+#define ND_OPERATION_FROM_CTRL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) & 0x3F)
+
+#define ND_DOS_DEVICE_NAME L"\\DosDevices\\Global\\NetworkDirect"
+#define ND_WIN32_DEVICE_NAME L"\\\\.\\NetworkDirect"
+
+enum nd_resource_type {
+	ND_PROVIDER = 0,
+	ND_ADAPTER,
+	ND_PD,
+	ND_CQ,
+	ND_MR,
+	ND_MW,
+	ND_SRQ,
+	ND_CONNECTOR,
+	ND_LISTENER,
+	ND_QP,
+	ND_VIRTUAL_PARTITION,
+	ND_RESOURCE_TYPE_COUNT
+};
+
+#define ND_OPERATION_COUNT 14
+
+#define IOCTL_ND_PROVIDER(i_)		IOCTL_ND(ND_PROVIDER, i_)
+#define IOCTL_ND_ADAPTER(i_)		IOCTL_ND(ND_ADAPTER, i_)
+#define IOCTL_ND_PD(i_)			IOCTL_ND(ND_PD, i_)
+#define IOCTL_ND_CQ(i_)			IOCTL_ND(ND_CQ, i_)
+#define IOCTL_ND_MR(i_)			IOCTL_ND(ND_MR, i_)
+#define IOCTL_ND_MW(i_)			IOCTL_ND(ND_MW, i_)
+#define IOCTL_ND_SRQ(i_)		IOCTL_ND(ND_SRQ, i_)
+#define IOCTL_ND_CONNECTOR(i_)		IOCTL_ND(ND_CONNECTOR, i_)
+#define IOCTL_ND_LISTENER(i_)		IOCTL_ND(ND_LISTENER, i_)
+#define IOCTL_ND_QP(i_)			IOCTL_ND(ND_QP, i_)
+#define IOCTL_ND_VIRTUAL_PARTITION(i_)	IOCTL_ND(ND_VIRTUAL_PARTITION, i_)
+
+/* Provider IOCTLs */
+#define IOCTL_ND_PROVIDER_INIT			IOCTL_ND_PROVIDER( 0 )
+#define IOCTL_ND_PROVIDER_BIND_FILE		IOCTL_ND_PROVIDER( 1 )
+#define IOCTL_ND_PROVIDER_QUERY_ADDRESS_LIST	IOCTL_ND_PROVIDER( 2 )
+#define IOCTL_ND_PROVIDER_RESOLVE_ADDRESS	IOCTL_ND_PROVIDER( 3 )
+#define IOCTL_ND_PROVIDER_MAX_OPERATION		4
+
+/* Adapter IOCTLs */
+#define IOCTL_ND_ADAPTER_OPEN			IOCTL_ND_ADAPTER( 0 )
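[Reviewer note] The CTL_CODE layout above is the Windows one: device type in bits 16-31, required access in bits 14-15, function code in bits 2-13, transfer method in bits 0-1, with the ND resource and operation packed into the 12-bit function field. Working through one value as an example:

	/*
	 *   IOCTL_ND_PD_CREATE
	 *     = CTL_CODE(FDN, ND_FUNCTION(ND_PD, 0), METHOD_BUFFERED, FAA)
	 *     = (0x12 << 16) | (0 << 14) | (((2 << 6) | 0) << 2) | 0
	 *     = 0x120200
	 *
	 * and the decode macros invert it:
	 *   ND_FUNCTION_FROM_CTL_CODE(0x120200)   == 0x80 (128)
	 *   ND_RESOURCE_FROM_CTL_CODE(0x120200)   == 2    (ND_PD)
	 *   ND_OPERATION_FROM_CTRL_CODE(0x120200) == 0    (create)
	 */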
+#define IOCTL_ND_ADAPTER_CLOSE			IOCTL_ND_ADAPTER( 1 )
+#define IOCTL_ND_ADAPTER_QUERY			IOCTL_ND_ADAPTER( 2 )
+#define IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST	IOCTL_ND_ADAPTER( 3 )
+#define IOCTL_ND_ADAPTER_MAX_OPERATION		4
+
+/* Protection Domain IOCTLs */
+#define IOCTL_ND_PD_CREATE		IOCTL_ND_PD( 0 )
+#define IOCTL_ND_PD_FREE		IOCTL_ND_PD( 1 )
+#define IOCTL_ND_PD_MAX_OPERATION	2
+
+/* Completion Queue IOCTLs */
+#define IOCTL_ND_CQ_CREATE		IOCTL_ND_CQ( 0 )
+#define IOCTL_ND_CQ_FREE		IOCTL_ND_CQ( 1 )
+#define IOCTL_ND_CQ_CANCEL_IO		IOCTL_ND_CQ( 2 )
+#define IOCTL_ND_CQ_GET_AFFINITY	IOCTL_ND_CQ( 3 )
+#define IOCTL_ND_CQ_MODIFY		IOCTL_ND_CQ( 4 )
+#define IOCTL_ND_CQ_NOTIFY		IOCTL_ND_CQ( 5 )
+#define IOCTL_ND_CQ_MAX_OPERATION	6
+
+/* Memory Region IOCTLs */
+#define IOCTL_ND_MR_CREATE		IOCTL_ND_MR( 0 )
+#define IOCTL_ND_MR_FREE		IOCTL_ND_MR( 1 )
+#define IOCTL_ND_MR_CANCEL_IO		IOCTL_ND_MR( 2 )
+#define IOCTL_ND_MR_REGISTER		IOCTL_ND_MR( 3 )
+#define IOCTL_ND_MR_DEREGISTER		IOCTL_ND_MR( 4 )
+#define IOCTL_NDK_MR_REGISTER		IOCTL_ND_MR( 5 )
+#define IOCTL_ND_MR_MAX_OPERATION	6
+
+/* Memory Window IOCTLs */
+#define IOCTL_ND_MW_CREATE		IOCTL_ND_MW( 0 )
+#define IOCTL_ND_MW_FREE		IOCTL_ND_MW( 1 )
+#define IOCTL_ND_MW_MAX_OPERATION	2
+
+/* Shared Receive Queue IOCTLs */
+#define IOCTL_ND_SRQ_CREATE		IOCTL_ND_SRQ( 0 )
+#define IOCTL_ND_SRQ_FREE		IOCTL_ND_SRQ( 1 )
+#define IOCTL_ND_SRQ_CANCEL_IO		IOCTL_ND_SRQ( 2 )
+#define IOCTL_ND_SRQ_GET_AFFINITY	IOCTL_ND_SRQ( 3 )
+#define IOCTL_ND_SRQ_MODIFY		IOCTL_ND_SRQ( 4 )
+#define IOCTL_ND_SRQ_NOTIFY		IOCTL_ND_SRQ( 5 )
+#define IOCTL_ND_SRQ_MAX_OPERATION	6
+
+/* Connector IOCTLs */
+#define IOCTL_ND_CONNECTOR_CREATE		IOCTL_ND_CONNECTOR( 0 )
+#define IOCTL_ND_CONNECTOR_FREE			IOCTL_ND_CONNECTOR( 1 )
+#define IOCTL_ND_CONNECTOR_CANCEL_IO		IOCTL_ND_CONNECTOR( 2 )
+#define IOCTL_ND_CONNECTOR_BIND			IOCTL_ND_CONNECTOR( 3 )
+#define IOCTL_ND_CONNECTOR_CONNECT		IOCTL_ND_CONNECTOR( 4 )
+#define IOCTL_ND_CONNECTOR_COMPLETE_CONNECT	IOCTL_ND_CONNECTOR( 5 )
+#define IOCTL_ND_CONNECTOR_ACCEPT		IOCTL_ND_CONNECTOR( 6 )
+#define IOCTL_ND_CONNECTOR_REJECT		IOCTL_ND_CONNECTOR( 7 )
+#define IOCTL_ND_CONNECTOR_GET_READ_LIMITS	IOCTL_ND_CONNECTOR( 8 )
+#define IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA	IOCTL_ND_CONNECTOR( 9 )
+#define IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS	IOCTL_ND_CONNECTOR( 10 )
+#define IOCTL_ND_CONNECTOR_GET_ADDRESS		IOCTL_ND_CONNECTOR( 11 )
+#define IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT	IOCTL_ND_CONNECTOR( 12 )
+#define IOCTL_ND_CONNECTOR_DISCONNECT		IOCTL_ND_CONNECTOR( 13 )
+#define IOCTL_ND_CONNECTOR_MAX_OPERATION	14
+
+/* Listener IOCTLs */
+#define IOCTL_ND_LISTENER_CREATE		IOCTL_ND_LISTENER( 0 )
+#define IOCTL_ND_LISTENER_FREE			IOCTL_ND_LISTENER( 1 )
+#define IOCTL_ND_LISTENER_CANCEL_IO		IOCTL_ND_LISTENER( 2 )
+#define IOCTL_ND_LISTENER_BIND			IOCTL_ND_LISTENER( 3 )
+#define IOCTL_ND_LISTENER_LISTEN		IOCTL_ND_LISTENER( 4 )
+#define IOCTL_ND_LISTENER_GET_ADDRESS		IOCTL_ND_LISTENER( 5 )
+#define IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST IOCTL_ND_LISTENER( 6 )
+#define IOCTL_ND_LISTENER_MAX_OPERATION		7
+
+/* Queue Pair IOCTLs */
+#define IOCTL_ND_QP_CREATE		IOCTL_ND_QP( 0 )
+#define IOCTL_ND_QP_CREATE_WITH_SRQ	IOCTL_ND_QP( 1 )
+#define IOCTL_ND_QP_FREE		IOCTL_ND_QP( 2 )
+#define IOCTL_ND_QP_FLUSH		IOCTL_ND_QP( 3 )
+#define IOCTL_ND_QP_MAX_OPERATION	4
+
+/* Kernel-mode only IOCTLs (IRP_MJ_INTERNAL_DEVICE_CONTROL) */
+#define IOCTL_NDV_PARTITION_RESOLVE_ADAPTER_ID	IOCTL_ND_VIRTUAL_PARTITION( 0 )
+#define IOCTL_NDV_PARTITION_CREATE		IOCTL_ND_VIRTUAL_PARTITION( 1 )
+#define IOCTL_NDV_PARTITION_FREE		IOCTL_ND_VIRTUAL_PARTITION( 2 )
+#define IOCTL_NDV_PARTITION_BIND		IOCTL_ND_VIRTUAL_PARTITION( 3 )
+#define IOCTL_NDV_PARTITION_UNBIND		IOCTL_ND_VIRTUAL_PARTITION( 4 )
+#define IOCTL_NDV_PARTITION_BIND_LUID		IOCTL_ND_VIRTUAL_PARTITION( 5 )
+#define IOCTL_NDV_PARTITION_MAX_OPERATION	6
+
+
+#define MB_SHIFT 20
+
+
+/* Ringbuffer size for the channel */
+#define NDV_NUM_PAGES_IN_RING_BUFFER 64
+
+#define NDV_MAX_PACKETS_PER_RECEIVE 8
+
+#define NDV_MAX_PACKET_COUNT 16304
+
+#define NDV_MAX_NUM_OUTSTANDING_RECEIVED_PACKETS (16304)
+#define NDV_MAX_HANDLE_TABLE_SIZE (16304)
+#define NDV_HOST_MAX_HANDLE_TABLE_SIZE (NDV_MAX_HANDLE_TABLE_SIZE * 16)
+
+
+#define NDV_MAX_MAPPINGS 4
+
+#define NDV_STATE_NONE		0x00000000
+#define NDV_STATE_CREATED	0x00000001
+#define NDV_STATE_CONNECTING	0x00000002
+#define NDV_STATE_INITIALIZING	0x00000003
+#define NDV_STATE_OPERATIONAL	0xEFFFFFFF
+#define NDV_STATE_FAILED	0xFFFFFFFF
+
+
+#define NDV_MAX_PRIVATE_DATA_SIZE 64
+#define NDV_MAX_IOCTL_SIZE 256
+
+/* max size of buffer for vector of ND_MAPPING */
+#define NDV_MAX_MAPPING_BUFFER_SIZE \
+	(NDV_MAX_MAPPINGS * sizeof(union nd_mapping))
+
+/* max expected ioctl buffer size from users */
+#define NDV_MAX_IOCTL_BUFFER_SIZE \
+	(NDV_MAX_IOCTL_SIZE + \
+	 NDV_MAX_MAPPING_BUFFER_SIZE + \
+	 NDV_MAX_PRIVATE_DATA_SIZE)
+
+/* max PFN array for inline buffers */
+#define NDV_MAX_INLINE_PFN_ARRAY_LENGTH 32
+
+/* Field header size for inline buffer */
+#define NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE \
+	(NDV_MAX_MAPPINGS * sizeof(NDV_PACKET_FIELD))
+
+/* Max for a single field */
+
+#define NDV_MAX_SINGLE_MAPPING_FIELD ( sizeof(GPA_RANGE) + \
+	(sizeof(PFN_NUMBER) * NDV_MAX_INLINE_PFN_ARRAY_LENGTH))
+
+/* Max for all inline data */
+
+#define NDV_MAX_MAPPING_DATA_SIZE (NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE + \
+	(NDV_MAX_MAPPINGS * NDV_MAX_SINGLE_MAPPING_FIELD))
+
+
+#define NDV_MAX_PACKET_HEADER_SIZE 256
+
+#define NDV_MAX_PACKET_SIZE (NDV_MAX_PACKET_HEADER_SIZE + \
+	NDV_MAX_IOCTL_BUFFER_SIZE + \
+	NDV_MAX_MAPPING_DATA_SIZE)
+
+/* Well known message type INIT is defined for the channel,
+ * not for the protocol.
+ */
+
+#define NDV_PACKET_TYPE_INIT 0xFFFFFFFF
+
+/* Invalid protocol version to identify uninitialized channels */
+
+#define NDV_PROTOCOL_VERSION_INVALID 0xFFFFFFFF
+
+/* Flags that control the behavior of packet handling */
+
+enum ndv_packet_options {
+	NDV_PACKET_OPTION_NONE = 0x00,
+
+	/* Indicates that the ExternalDataMdl parameter is expected to be
+	 * passed and must be handled in the receiver. This call must be
+	 * handled specially to ensure that the MDL can be created correctly.
+	 */
+	NDV_PACKET_OPTION_EXTERNAL_DATA = 0x01,
+
+	/* Indicates that the receiver must execute the handler at passive. */
+	NDV_PACKET_OPTIONS_REQUIRES_PASSIVE = 0x02,
+
+	/* Indicates that the sender does not expect and is not waiting for a
+	 * response packet.
+	 */
+	NDV_PACKET_OPTIONS_POST = 0x04,
+};
+
+#define NDV_PACKET_TYPE(id_, opt_) \
+	(((opt_)<<24) | (id_))
+
+#define NDV_PACKET_TYPE_OPTIONS(type_) \
+	(((type_) >> 24) & 0xFF)
+
+#define NDV_PACKET_TYPE_ID(type_) \
+	((type_) & 0xFFFFFF)
+
+#define NDV_ADD_PACKET_OPTION(type_, opt_) \
+	(type_) |= (opt_<<24)
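[Reviewer note] Packet options live in the top eight bits of packet_type and the message ID in the low 24, which is why hvnd_callback() strips the options with NDV_PACKET_TYPE_ID() before dispatching. A worked example (NDV_PKT_ID1_CONTROL is 5 in the enum below):

	/*
	 *   u32 pkt_type = NDV_PKT_ID1_CONTROL;              // 0x00000005
	 *   NDV_ADD_PACKET_OPTION(pkt_type,
	 *       NDV_PACKET_OPTIONS_REQUIRES_PASSIVE);        // 0x02000005
	 *
	 *   NDV_PACKET_TYPE_ID(pkt_type)      == 5
	 *   NDV_PACKET_TYPE_OPTIONS(pkt_type) == 0x02
	 */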
+
+/* The header value sent on all packets */
+union ndv_packet_hdr {
+
+	struct {
+		/* The type of packet.
+		 * This value should be created with the NDV_PACKET_TYPE macro
+		 * to include all packet options within the packet type.
+		 */
+		u32 packet_type;
+		/* The size of the entire fixed message structure that exists
+		 * before the data. This must be >= sizeof(NDV_PACKET_HEADER).
+		 */
+		u32 hdr_sz;
+		/* The size of the data that follows the message header;
+		 * data_sz + hdr_sz gives the total size of
+		 * the buffer that is used.
+		 */
+		u32 data_sz;
+		/* The status code used to indicate success or failure.
+		 * It is only used in completions and during responses.
+		 */
+		u32 status; //KYS: NTSTATUS?
+	};
+
+	u64 padding[2]; //KYS: why?
+};
+
+
+/* The core INIT packet. This message is defined in the channel,
+ * not in the protocol. This message should never change size
+ * or behavior, as it could impact compatibility in the future.
+ * This packet is used to negotiate the protocol version, so changing
+ * its size could break backward compat.
+ */
+
+union ndv_packet_init {
+	struct {
+		u32 packet_type;
+		u32 protocol_version;
+		u32 flags;
+	};
+	u64 padding[2];
+} __packed;
+
+#define NDV_PACKET_INIT_SIZE 16
+
+/* Data packing flags used for accessing the dynamic fields inside a packet */
+#define NDV_DATA_PACKING_2 0x1
+#define NDV_DATA_PACKING_4 0x3
+#define NDV_DATA_PACKING_8 0x7
+
+
+#define NDV_PROTOCOL_VERSION_1		0x0100
+#define NDV_PROTOCOL_VERSION_CURRENT	NDV_PROTOCOL_VERSION_1
+#define NDV_PROTOCOL_VERSION_COUNT	1
+
+struct ndv_pkt_field {
+	u32 size;
+	u32 offset;
+};
+
+enum ndv_pkt_id {
+	NDV_PKT_UNKNOWN = 0,
+	/* Version 1 Message ID's */
+	NDV_PKT_ID1_BIND,
+	NDV_PKT_ID1_CREATE,
+	NDV_PKT_ID1_CLEANUP,
+	NDV_PKT_ID1_CANCEL,
+	NDV_PKT_ID1_CONTROL,
+	NDV_PKT_ID1_COMPLETE,
+	NDV_PKT_ID1_INIT_RESOURCES,
+};
+
+/* The guest will send this as the first message just after init.
+ * The resources are reserved per channel.
+ */
+
+struct ndv_pkt_hdr_init_resources_1 {
+
+	union ndv_packet_hdr pkt_hdr;
+	u16 io_space_sz_mb;
+	u64 io_space_start;
+
+};
+
+
+
+/* The guest will send this packet to the host after channel init
+ * to query support for the adapters that are registered.
+ */
+
+struct ndv_pkt_hdr_bind_1 {
+	union ndv_packet_hdr pkt_hdr;
+	bool unbind;
+	union nd_sockaddr_inet ip_address;
+	struct if_physical_addr phys_addr;
+	u64 guest_id;
+};
+
+union ndv_context_handle {
+	u64 val64;
+	struct {
+		u32 local;
+		u32 remote;
+	};
+};
+
+struct ndv_pkt_hdr_create_1 {
+	union ndv_packet_hdr pkt_hdr;
+
+	/* Identifies the object used to track this file handle on both
+	 * the guest and the host. When sent from the guest, it will contain
+	 * the guest handle. On success, the host will populate and return
+	 * its handle value as well.
+	 */
+
+	union ndv_context_handle handle;
+
+	/* The parameters sent to the CreateFile call */
+	u32 access_mask;
+	u32 open_options;
+
+	u16 file_attributes; //KYS: This field must be 64 bit aligned
+
+	u16 share_access; //KYS
+
+	u32 kys_padding; //KYS
+
+	u16 ea_length; //KYS; needs to be 64 bit aligned; what is ea length - unused
+};
+
+
+struct ndv_pkt_hdr_cleanup_1 {
+	union ndv_packet_hdr pkt_hdr;
+
+	/* Identifies the object used to track this file handle on both
+	 * the guest and the host. When sent from the guest, it will contain
+	 * both the guest and host handle values. The host will use this
+	 * value to clean up its resource, then update its portion of the
+	 * handle to NDV_HANDLE_NULL before returning the data back to the
+	 * guest.
+	 */
+	union ndv_context_handle handle;
+};
+
+struct ndv_pkt_hdr_cancel_1 {
+	union ndv_packet_hdr pkt_hdr;
+	union ndv_context_handle file_handle;
+	union ndv_context_handle irp_handle;
+};
+
+struct ndv_bind_port_info {
+	//LUID authentication_id; //KYS: LUID?
+	bool is_admin;
+};
+
+struct ndv_extended_data_flds {
+	union {
+		u32 field_count;
+		u64 padding;
+	};
+
+	//struct ndv_pkt_field fields[ANYSIZE_ARRAY]; //KYS?
+};
+
+
+struct ndv_packet_hdr_control_1 {
+	union ndv_packet_hdr pkt_hdr;
+	/* Identifies the object used to track this file handle on both
+	 * the guest and the host. This should always have both guest
+	 * and host handle values inside it.
+	 */
+
+	union ndv_context_handle file_handle;
+
+	/* The handle information for the allocated irp context object.
+	 * This information is used when the host/guest starts the
+	 * cancellation.
+	 */
+	union ndv_context_handle irp_handle;
+
+	/* The input data describing the IO control parameters */
+
+	u32 io_cntrl_code;
+	u32 output_buf_sz;
+	u32 input_buf_sz;
+	u32 input_output_buf_offset;
+
+	/* These are used in the return message to indicate the status of the
+	 * IO operation and the amount of data written to the output buffer.
+	 */
+	u32 io_status; //KYS: NTSTATUS?
+	u32 bytes_returned;
+
+	/* This contains the field information for additional data that is
+	 * sent with the packet and that is IOCTL specific.
+	 */
+
+	struct ndv_pkt_field extended_data;
+};
+
+/*
+ * Include MLX specific defines.
+ */
+
+#include "mx_abi.h"
+
+/* Driver specific state.
+ */
+
+/*
+ * We need to have the host open a file; some
+ * Windows constants for open.
+ */
+#define STANDARD_RIGHTS_ALL (0x001F0000L)
+#define FILE_ATTRIBUTE_NORMAL (0x80)
+#define FILE_SHARE_READ (0x00000001)
+#define FILE_SHARE_WRITE (0x00000002)
+#define FILE_SHARE_DELETE (0x00000004)
+#define FILE_FLAG_OVERLAPPED (0x40000000)
+#define FILE_SHARE_ALL (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE)
+#define CREATE_ALWAYS (2)
+#define OPEN_EXISTING (3)
+
+#define RTL_NUMBER_OF(_x) \
+	sizeof(_x)/sizeof(_x[0])
+/*
+ * The context structure tracks the open state.
+ */
+
+/*
+ * Packet layout for open adaptor.
+ */
+
+/*
+ * Packet for querying the address list.
+ */
+
+union query_addr_list_ioctl {
+	struct nd_handle in;
+	union nd_sockaddr_inet out[16]; //KYS a max of 16 addresses
+};
+
+struct pkt_query_addr_list {
+	struct ndv_packet_hdr_control_1 hdr;
+	union query_addr_list_ioctl ioctl;
+	unsigned long activity_id;
+};
+
+
+struct pkt_fld {
+	u32 size;
+	u32 offset;
+};
+
+struct fld_data {
+	union {
+		u64 padding;
+	};
+};
+
+struct extended_data_oad {
+	union {
+		u32 cnt;
+		u64 padding;
+	};
+	/* offsets are from start of extended data struct
+	 * and should start on 8 byte boundary
+	 */
+	struct pkt_fld fields[IBV_GET_CONTEXT_MAPPING_MAX];
+};
+
+union oad_ioctl {
+	struct nd_open_adapter input;
+	struct nd_resource_descriptor resrc_desc;
+};
+
+union oad_mappings {
+	struct ibv_get_context_req ctx_input;
+	struct ibv_get_context_resp ctx_output;
+};
+
+struct pkt_nd_open_adapter {
+	struct ndv_packet_hdr_control_1 hdr;
+
+	union oad_ioctl ioctl;
+	union oad_mappings mappings;
+
+	/*
+	 * Extended data.
+	 */
+	struct extended_data_oad ext_data;
+};
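[Reviewer note] The variable-size tail of a control packet is described twice: the header's extended_data field gives the size and offset of the whole region, with the offset apparently measured from the end of the control header (that is how hvnd_create_qp() computes ext_data_offset in the provider code earlier in this patch), and the per-field offsets inside the region count from the start of the extended data struct, per the comment above. A receiver-side sketch of locating field k under those assumptions:

	struct pkt_nd_open_adapter *pkt = /* received packet */;
	char *ext;
	struct extended_data_oad *oad;
	void *field_k;

	/* region offset is relative to the byte after the control header */
	ext = (char *)(&pkt->hdr + 1) + pkt->hdr.extended_data.offset;
	oad = (struct extended_data_oad *)ext;

	/* field offsets are relative to the start of the extended data */
	field_k = ext + oad->fields[k].offset;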
+ */ + +struct cq_db_gpa { + u32 byte_count; + u32 byte_offset; + u64 pfn_array[2]; +}; + +struct cq_sn_gpa { + u32 byte_count; + u32 byte_offset; + u64 pfn_array[2]; +}; + +struct create_cq_ext_data { + union { + u32 cnt; + u64 padding; + }; + /* offsets are from start of extended data struct + * and should start on 8 byte boundary + */ + struct pkt_fld fields[MLX4_IB_CREATE_CQ_MAPPING_MAX]; + struct cq_db_gpa db_gpa; + struct cq_sn_gpa sn_gpa; + struct gpa_range cqbuf_gpa; +}; + +union create_cq_ioctl { + struct nd_create_cq input; + struct nd_resource_descriptor resrc_desc; +}; + +union create_cq_mappings { + struct ibv_create_cq cq_in; + struct ibv_create_cq_resp cq_resp; +}; + +struct pkt_nd_create_cq { + struct ndv_packet_hdr_control_1 hdr; + + union create_cq_ioctl ioctl; + union create_cq_mappings mappings; + + /* + * Extended data. + */ + struct create_cq_ext_data ext_data; +}; + +/* + * IOCTL to free CQ. + */ +struct free_cq_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_free_cq { + struct ndv_packet_hdr_control_1 hdr; + + struct free_cq_ioctl ioctl; +}; + + +/* + * IOCTL to QUERY CQ - CQ NOTIFY + */ + +struct notify_cq_ioctl { + struct nd_cq_notify in; +}; + +struct pkt_nd_notify_cq { + struct ndv_packet_hdr_control_1 hdr; + struct notify_cq_ioctl ioctl; +}; + +/* + * IOCTL to Create a listner + */ + +struct nd_ep_create { + struct nd_handle hdr; + bool to_semantics; + unsigned long activity_id; +}; + +union listener_cr_ioctl { + struct nd_ep_create in; + u64 out; +}; + +struct pkt_nd_cr_listener { + struct ndv_packet_hdr_control_1 hdr; + union listener_cr_ioctl ioctl; +}; + +/* + * IOCTL to free listener. + */ + +struct listener_free_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_free_listener { + struct ndv_packet_hdr_control_1 hdr; + struct listener_free_ioctl ioctl; +}; + +/* + * IOCTL for listener cancel IO. + */ +struct listener_cancelio_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_cancelio_listener { + struct ndv_packet_hdr_control_1 hdr; + struct listener_cancelio_ioctl ioctl; +}; + +/* + * IOCTL for LISTENER BIND + */ + +union listener_bind_ioctl { + struct ndk_bind in; +}; + +struct pkt_nd_bind_listener { + struct ndv_packet_hdr_control_1 hdr; + union listener_bind_ioctl ioctl; +}; + +/* + * After the listener is bound, enable + * listening. + */ + +union listener_listen_ioctl { + struct nd_listen in; +}; + +struct pkt_nd_listen_listener { + struct ndv_packet_hdr_control_1 hdr; + union listener_listen_ioctl ioctl; +}; + +/* + * IOCTL for getting the adddress from listener. + * + */ + +union listener_get_addr_ioctl { + struct nd_handle in; + union nd_sockaddr_inet out; +}; + +struct pkt_nd_get_addr_listener { + struct ndv_packet_hdr_control_1 hdr; + union listener_get_addr_ioctl ioctl; +}; + +/* + * IOCTL to get a connection from a listener. + */ + +union listener_get_connection_ioctl { + struct nd_get_connection_request in; + union nd_sockaddr_inet out; +}; + +struct pkt_nd_get_connection_listener { + struct ndv_packet_hdr_control_1 hdr; + union listener_get_connection_ioctl ioctl; +}; + + +/* + * Connector IOCTLs + */ + +/* + * IOCTL to create connector. + */ + +union connector_cr_ioctl { //KYS should this be a union or struct? + struct nd_ep_create in; + u64 out; +}; + +struct pkt_nd_cr_connector { + struct ndv_packet_hdr_control_1 hdr; + union connector_cr_ioctl ioctl; //KYS: union or struct +}; + +/* + * IOCTL to free connector. 
+ */ + +struct connector_free_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_free_connector { + struct ndv_packet_hdr_control_1 hdr; + struct connector_free_ioctl ioctl; +}; + +/* + * IOCTL to cancel I/O on a connector. + */ + +struct connector_cancelio_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_cancelio_connector { + struct ndv_packet_hdr_control_1 hdr; + struct connector_cancelio_ioctl ioctl; +}; + +/* + * IOCTL to Bind an address to the connector. + */ + +union connector_bind_ioctl { + struct ndk_bind in; +}; + +struct pkt_nd_bind_connector { + struct ndv_packet_hdr_control_1 hdr; + union connector_bind_ioctl ioctl; +}; + +/* + * IOCTL to connect a connector. + */ + +struct connector_connect_in { + struct nd_connect hdr; + u8 retry_cnt; + u8 rnr_retry_cnt; + u8 priv_data[56]; + unsigned long activity_id; +}; + +union connector_connect_ioctl { + struct connector_connect_in in; +}; + +struct pkt_nd_connector_connect { + struct ndv_packet_hdr_control_1 hdr; + union connector_connect_ioctl ioctl; +}; + +/* + * IOCTL for connector complete connect + */ + +struct complete_connect_in { + struct nd_handle hdr; + u8 rnr_nak_to; + unsigned long activity_id; +}; + +struct complete_connect_out { + enum ibv_qp_state state; +}; + +union connector_complete_connect_ioctl { + struct complete_connect_in in; + struct complete_connect_out out; +}; + +struct pkt_nd_connector_connect_complete { + struct ndv_packet_hdr_control_1 hdr; + union connector_complete_connect_ioctl ioctl; +}; + + +#define MAX_PRIVATE_DATA_LEN 148 + +/* + * IOCTL for connector accept. + */ + +struct connector_accept_in { + struct nd_accept hdr; + u8 rnr_retry_cnt; + u8 rnr_nak_to; + u8 private_data[MAX_PRIVATE_DATA_LEN]; + unsigned long activity_id; +}; + +struct connector_accept_out { + enum ibv_qp_state state; +}; + +union connector_accept_ioctl { + struct connector_accept_in in; + struct connector_accept_out out; +}; + +struct pkt_nd_connector_accept { + struct ndv_packet_hdr_control_1 hdr; + union connector_accept_ioctl ioctl; +}; + +/* + * IOCTL for connector to reject a connection. + */ + +struct connector_reject_in { + struct nd_reject hdr; + u8 private_data[MAX_PRIVATE_DATA_LEN]; +}; + +struct connector_reject_out { + enum ibv_qp_state state; +}; + +union connector_reject_ioctl { + struct connector_reject_in in; + struct connector_reject_out out; +}; + +struct pkt_nd_connector_reject { + struct ndv_packet_hdr_control_1 hdr; + union connector_reject_ioctl ioctl; +}; + +/* + * IOCTL to get connector read limits. + */ + +struct connector_get_rd_limits_in { + struct nd_handle in; +}; + +struct connector_get_rd_limits_out { + struct nd_read_limits out; +}; + +union connector_get_rd_limits_ioctl { + struct connector_get_rd_limits_in in; + struct connector_get_rd_limits_out out; +}; + +struct pkt_nd_connector_get_rd_limits { + struct ndv_packet_hdr_control_1 hdr; + union connector_get_rd_limits_ioctl ioctl; +}; + +/* + * IOCTL to get connector private data. + */ +union connector_get_priv_data_ioctl { + struct nd_handle in; + u8 out[MAX_PRIVATE_DATA_LEN]; +}; + +struct pkt_nd_connector_get_priv_data { + struct ndv_packet_hdr_control_1 hdr; + union connector_get_priv_data_ioctl ioctl; +}; + + +/* + * IOCTL get peer address. + */ + +union connector_get_peer_addr_ioctl { + struct nd_handle in; + union nd_sockaddr_inet out; +}; + +struct pkt_nd_connector_get_peer_addr { + struct ndv_packet_hdr_control_1 hdr; + union connector_get_peer_addr_ioctl ioctl; +}; + +/* + * IOCTL to get connector address. 
+ */ + +union connector_get_addr_ioctl { + struct nd_handle in; + union nd_sockaddr_inet out; +}; + +struct pkt_nd_connector_get_addr { + struct ndv_packet_hdr_control_1 hdr; + union connector_get_addr_ioctl ioctl; +}; + +/* + * IOCTL for disconnect notification. + */ + +union connector_notify_disconnect_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_connector_notify_disconnect { + struct ndv_packet_hdr_control_1 hdr; + union connector_notify_disconnect_ioctl ioctl; +}; + +union connector_disconnect_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_connector_disconnect { + struct ndv_packet_hdr_control_1 hdr; + union connector_notify_disconnect_ioctl ioctl; +}; + +/* + * IOCTLs for QP operations. + */ + +/* + * Create qp IOCTL. + */ + +struct qp_db_gpa { + u32 byte_count; + u32 byte_offset; + u64 pfn_array[1]; +}; + +struct create_qp_ext_data { + union { + u32 cnt; + u64 padding; + }; + /* offsets are from start of extended data struct + * and should start on 8 byte boundary + */ + struct pkt_fld fields[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; + struct qp_db_gpa db_gpa; + struct gpa_range qpbuf_gpa; +}; + +union create_qp_ioctl { + struct nd_create_qp input; + struct nd_resource_descriptor resrc_desc; +}; + +union create_qp_mappings { + struct ibv_create_qp qp_in; + struct ibv_create_qp_resp qp_resp; +}; + +struct pkt_nd_create_qp { + struct ndv_packet_hdr_control_1 hdr; + + union create_qp_ioctl ioctl; + union create_qp_mappings mappings; + + /* + * Extended data. + */ + struct create_qp_ext_data ext_data; +}; + +/* + * IOCTL to flush a QP. + */ +struct flush_qp_ioctl { + struct nd_handle in; + enum ibv_qp_state out; +}; + +struct pkt_nd_flush_qp { + struct ndv_packet_hdr_control_1 hdr; + struct flush_qp_ioctl ioctl; +}; + +/* + * Memory Region IOCTLS + */ +union create_mr_ioctl { + struct nd_handle in; + u64 out; +}; + +struct pkt_nd_create_mr { + struct ndv_packet_hdr_control_1 hdr; + union create_mr_ioctl ioctl; +}; + +struct mr_out { + u32 lkey; + u32 rkey; + unsigned long activity_id; +}; + + +union register_mr_ioctl { + struct nd_mr_register in; + struct mr_out out; +}; + +struct pkt_nd_register_mr { + struct ndv_packet_hdr_control_1 hdr; + union register_mr_ioctl ioctl; +}; + +struct deregister_mr_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_deregister_mr { + struct ndv_packet_hdr_control_1 hdr; + struct deregister_mr_ioctl ioctl; +}; + +/* + * IOCTL to disconnect connector + */ + +/* + * Create PD IOCTL. + */ +struct nd_create_pd_ioctl { + union { + struct nd_handle in; + u64 out_handle; + }; + struct ibv_alloc_pd_resp resp; +}; + +struct pkt_nd_pd_create { + struct ndv_packet_hdr_control_1 hdr; + struct nd_create_pd_ioctl ioctl; +}; + +/* + * Free Handle. Check the layout with Luke. + * + */ +struct free_handle_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_free_handle { + struct ndv_packet_hdr_control_1 hdr; + struct free_handle_ioctl ioctl; +}; + +/* + * Cancel I/O. + */ + +struct cancel_io_ioctl { + struct nd_handle in; +}; + +struct pkt_nd_cancel_io { + struct ndv_packet_hdr_control_1 hdr; + struct cancel_io_ioctl ioctl; +}; + +/* + * Connector states: + */ + +enum connector_state { + HVND_CON_INCOMING, + HVND_CON_INCOMING_ESTABLISHED, + HVND_CON_INCOMING_REJECTED, + HVND_CON_OUTGOING_REQUEST +}; + + +/* + * Adaptor query IOCTL. 
+ */
+struct nd_adap_query_ioctl {
+	union {
+		struct nd_adapter_query ad_q;
+		struct adapter_info_v2 ad_info;
+	};
+};
+
+struct pkt_nd_query_adaptor {
+	struct ndv_packet_hdr_control_1 hdr;
+	struct nd_adap_query_ioctl ioctl;
+};
+
+struct nd_ioctl {
+	union {
+		struct nd_handle handle;
+		u8 raw_buffer[NDV_MAX_IOCTL_BUFFER_SIZE];
+	};
+};
+
+struct pkt_nd_provider_ioctl {
+	struct ndv_packet_hdr_control_1 hdr;
+	struct nd_ioctl ioctl;
+};
+
+struct hvnd_ib_pd {
+	struct ib_pd ibpd;
+	u32 pdn;
+	u64 handle;
+};
+
+struct hvnd_work {
+	struct work_struct work;
+	void *callback_arg;
+};
+
+struct hvnd_disconnect_work {
+	struct work_struct work;
+	int status;
+	void *callback_arg;
+};
+
+/*
+struct hvnd_delayed_work {
+	struct delayed_work work;
+	void *callback_arg;
+};
+*/
+
+enum hvnd_cm_state {
+	hvnd_cm_idle = 0,
+	hvnd_cm_connect_reply_sent, //active
+	hvnd_cm_connect_reply_refused,
+	hvnd_cm_connect_received, //active
+	hvnd_cm_connect_request_sent, //passive
+	hvnd_cm_accept_sent,
+	hvnd_cm_close_sent,
+	hvnd_cm_established_sent,
+};
+
+struct incoming_pkt {
+	struct list_head list_entry;
+	char pkt[0];
+};
+
+struct hvnd_ep_obj {
+/*
+	spinlock_t ep_lk;
+	bool to_be_destroyed;
+	bool io_outstanding;
+
+	wait_queue_head_t wait;
+	bool stopped;
+	atomic_t process_refcnt; // how many NDV_PKT_ID1_COMPLETE packets we are currently processing
+*/
+	bool stopping;
+	wait_queue_head_t wait_pending;
+	atomic_t nr_requests_pending;
+
+	enum nd_resource_type type;
+	enum connector_state state; //KYS need to look at locking
+	struct iw_cm_id *cm_id;
+	enum hvnd_cm_state cm_state;
+	struct completion block_event;
+	struct completion disconnect_event;
+	struct completion connector_accept_event;
+	int connector_accept_status;
+	u64 ep_handle;
+	spinlock_t incoming_pkt_list_lock;
+	struct list_head incoming_pkt_list;
+	struct hvnd_ep_obj *parent;
+	struct hvnd_dev *nd_dev;
+	struct hvnd_ucontext *uctx;
+	struct hvnd_work wrk;
+	struct hvnd_cq *cq;
+	u8 ord;
+	u8 ird;
+	char priv_data[MAX_PRIVATE_DATA_LEN];
+	bool incoming;
+	atomic_t disconnect_notified;
+	u64 outstanding_handle;
+	u32 local_irp;
+	struct hvnd_ep_obj *outstanding_ep;
+	struct pkt_nd_connector_connect connector_connect_pkt;
+	int connector_connect_retry;
+};
+
+struct hvnd_ucontext {
+	struct ib_ucontext ibucontext;
+	struct list_head listentry;
+	struct ndv_pkt_hdr_create_1 create_pkt;
+	struct ndv_pkt_hdr_create_1 create_pkt_ovl; /* Overlap handle */
+	struct pkt_nd_provider_ioctl pr_init_pkt;
+	union ndv_context_handle file_handle;
+	union ndv_context_handle file_handle_ovl;
+
+	struct pkt_nd_open_adapter o_adap_pkt;
+
+	u64 adaptor_hdl;
+
+	/*
+	 * Protection domain state.
+	 */
+	struct pkt_nd_pd_create pd_cr_pkt;
+
+	u64 uar_base;
+	u64 bf_base;
+	u32 bf_buf_size;
+	u32 bf_offset;
+	u32 cqe_size;
+	u32 max_qp_wr;
+	u32 max_sge;
+	u32 max_cqe;
+	u32 num_qps;
+
+	/*
+	 * State to manage doorbell pages:
+	 */
+	struct list_head db_page_list;
+	struct mutex db_page_mutex;
+
+	atomic_t refcnt;
+
+};
+
+struct hvnd_dev {
+	struct ib_device ibdev;
+	struct hv_device *hvdev;
+	u32 device_cap_flags;
+	unsigned char nports;
+	bool ib_active;
+
+	/* State to manage interaction with the host.
+ */ + + spinlock_t uctxt_lk; + struct list_head listentry; + + unsigned long mmio_sz; + unsigned long mmio_start_addr; + struct resource mmio_resource; + void *mmio_virt; + + unsigned long negotiated_version; + union ndv_packet_init init_pkt; + struct ndv_pkt_hdr_init_resources_1 resources; + struct ndv_pkt_hdr_bind_1 bind_pkt; + + struct ndv_pkt_hdr_create_1 global_create_pkt; + union ndv_context_handle global_file_handle; + + struct semaphore query_pkt_sem; + bool query_pkt_set; + struct pkt_nd_query_adaptor query_pkt; + + /* + * ID tables. + */ + spinlock_t id_lock; + + struct idr cqidr; + struct idr qpidr; + struct idr mmidr; + struct idr irpidr; + struct idr uctxidr; + atomic_t open_cnt; + + char ip_addr[4]; + char mac_addr[6]; + struct completion addr_set; + int bind_complete; + struct mutex bind_mutex; +}; + +struct hvnd_cq { + struct ib_cq ibcq; + void *cq_buf; + void *db_addr; + u32 arm_sn; + u32 entries; + + u32 cqn; + u32 cqe; + u64 cq_handle; + + struct ib_umem *umem; + struct ib_umem *db_umem; + struct mlx4_ib_user_db_page user_db_page; + struct hvnd_ucontext *uctx; + struct hvnd_ep_obj ep_object; //KYS need to clean this up; have a cq irp state + bool monitor; + bool upcall_pending; +}; + +struct hvnd_qp { + struct ib_qp ibqp; + void *qp_buf; + void *db_addr; + u32 buf_size; + u8 port; + struct hvnd_dev *nd_dev; + + __u8 log_sq_bb_count; + __u8 log_sq_stride; + __u8 sq_no_prefetch; + + int rq_wqe_cnt; + int rq_wqe_shift; + int rq_max_gs; + + int sq_wqe_cnt; + int sq_wqe_shift; + int sq_max_gs; + + u32 max_inline_data; + + u32 initiator_q_depth; + u32 initiator_request_sge; + + u32 receive_q_depth; + u32 receive_request_sge; + + struct hvnd_cq *recv_cq; + struct hvnd_cq *send_cq; + + u64 receive_cq_handle; + u64 initiator_cq_handle; + u64 pd_handle; + + u64 qp_handle; + u32 qpn; + u32 max_send_wr; + u32 max_recv_wr; + u32 max_send_sge; + u32 max_recv_sge; + + struct ib_umem *umem; + struct ib_umem *db_umem; + struct mlx4_ib_user_db_page user_db_page; + struct hvnd_ucontext *uctx; + struct iw_cm_id *cm_id; + + /* + * Current QP state; need to look at locking. + * XXXKYS + */ + enum ib_qp_state qp_state; + bool cq_notify; + wait_queue_head_t wait; + atomic_t refcnt; + struct hvnd_ep_obj *connector; +}; + +struct hvnd_mr { + struct ib_mr ibmr; + struct hvnd_ib_pd *pd; + struct ib_umem *umem; + u64 start; + u64 length; + u64 virt; + int acc; + u64 mr_handle; + u32 mr_lkey; + u32 mr_rkey; +}; + +struct hvnd_cookie { + struct completion host_event; + void *pkt; +}; + +/* + * Definitions to retrieve the IP address. 
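+ *
+ * (Each hvnd_* object above embeds its ib_* counterpart, and the
+ * to_nd_*() helpers that follow recover the wrapper via container_of().
+ * For example, a verbs entry point handed a bare struct ib_cq can reach
+ * the driver-private state like this (sketch):
+ *
+ *	static void example_touch_cq(struct ib_cq *ibcq)
+ *	{
+ *		struct hvnd_cq *cq = container_of(ibcq, struct hvnd_cq, ibcq);
+ *
+ *		cq->arm_sn++;	// driver state hangs off the wrapper
+ *	}
+ *
+ * which is exactly what to_nd_cq() expands to.)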
+ */ + +#define HVND_CURRENT_VERSION 0 + +struct hvnd_ipaddr_tuple { + char mac_address[ETH_ALEN]; + struct sockaddr addr; +}; + +struct hvnd_msg { + int status; + struct hvnd_ipaddr_tuple ip_tuple; +}; + +static inline struct hvnd_ib_pd *to_nd_pd(struct ib_pd *pd) +{ + return container_of(pd, struct hvnd_ib_pd, ibpd); +} + +static inline struct hvnd_dev *to_nd_dev(struct ib_device *ibdev) +{ + return container_of(ibdev, struct hvnd_dev, ibdev); +} + +static inline struct hvnd_cq *to_nd_cq(struct ib_cq *ibcq) +{ + return container_of(ibcq, struct hvnd_cq, ibcq); +} + +static inline struct hvnd_qp *to_nd_qp(struct ib_qp *ibqp) +{ + return container_of(ibqp, struct hvnd_qp, ibqp); +} + +static inline struct hvnd_ucontext *to_nd_context(struct ib_ucontext *ibucontext) +{ + return container_of(ibucontext, struct hvnd_ucontext, ibucontext); +} + +static inline struct hvnd_ucontext *get_uctx_from_pd(struct ib_pd *pd) +{ + return to_nd_context(pd->uobject->context); +} + +static inline struct hvnd_mr *to_nd_mr(struct ib_mr *ibmr) +{ + return container_of(ibmr, struct hvnd_mr, ibmr); +} +/* + * ID management. + */ + +static inline int insert_handle(struct hvnd_dev *dev, struct idr *idr, + void *handle, u32 id) +{ + int ret; + unsigned long flags; + + idr_preload(GFP_KERNEL); + spin_lock_irqsave(&dev->id_lock, flags); + + ret = idr_alloc(idr, handle, id, id + 1, GFP_ATOMIC); + + spin_unlock_irqrestore(&dev->id_lock, flags); + idr_preload_end(); + + BUG_ON(ret == -ENOSPC); + return ret < 0 ? ret : 0; +} + +static inline void remove_handle(struct hvnd_dev *dev, struct idr *idr, u32 id) +{ + unsigned long flags; + + spin_lock_irqsave(&dev->id_lock, flags); + idr_remove(idr, id); + spin_unlock_irqrestore(&dev->id_lock, flags); +} + +static inline struct hvnd_cq *get_cqp(struct hvnd_dev *dev, u32 cqid) +{ + struct hvnd_cq *cqp; + unsigned long flags; + + spin_lock_irqsave(&dev->id_lock, flags); + cqp = idr_find(&dev->cqidr, cqid); + spin_unlock_irqrestore(&dev->id_lock, flags); + + return cqp; +} + +static inline struct hvnd_qp *get_qpp(struct hvnd_dev *dev, u32 qpid) +{ + struct hvnd_qp *qpp; + unsigned long flags; + + spin_lock_irqsave(&dev->id_lock, flags); + qpp = idr_find(&dev->qpidr, qpid); + spin_unlock_irqrestore(&dev->id_lock, flags); + + return qpp; +} + +static inline struct hvnd_ucontext *get_uctx(struct hvnd_dev *dev, u32 pid) +{ + struct hvnd_ucontext *uctx; + unsigned long flags; + + spin_lock_irqsave(&dev->id_lock, flags); + uctx = idr_find(&dev->uctxidr, pid); + spin_unlock_irqrestore(&dev->id_lock, flags); + + return uctx; +} + + +static inline void *map_irp_to_ctx(struct hvnd_dev *nd_dev, u32 irp) +{ + void *ctx; + unsigned long flags; + + spin_lock_irqsave(&nd_dev->id_lock, flags); + ctx = idr_find(&nd_dev->irpidr, irp); + spin_unlock_irqrestore(&nd_dev->id_lock, flags); + + return ctx; +} + + + +void hvnd_callback(void *context); +int hvnd_negotiate_version(struct hvnd_dev *nd_dev); +int hvnd_init_resources(struct hvnd_dev *nd_dev); +int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, char *mac_addr); +int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx); +int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx); +int hvnd_query_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx); +int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev, + struct hvnd_ib_pd *hvnd_pd); + +/* + * CQ operations. 
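+ *
+ * (The IDRs guarded by id_lock map the 32-bit identifiers the host hands
+ * back -- cqn, qpn, irp, pid -- onto driver objects, so completion
+ * handlers can find their targets. A typical round trip, assuming
+ * cq->cqn has already been assigned by the host (sketch):
+ *
+ *	ret = insert_handle(nd_dev, &nd_dev->cqidr, cq, cq->cqn);
+ *	if (ret)
+ *		return ret;		// idr_alloc() failure, e.g. -ENOMEM
+ *
+ *	// later, e.g. when a completion arrives for cqn:
+ *	cq = get_cqp(nd_dev, cqn);
+ *	if (cq)
+ *		wake_consumer(cq);	// hypothetical consumer
+ *
+ *	// and on teardown:
+ *	remove_handle(nd_dev, &nd_dev->cqidr, cq->cqn);
+ *
+ * wake_consumer() is hypothetical; the point is that lookup and removal
+ * all funnel through the same spinlock.)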
+ */
+int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		   struct hvnd_cq *cq);
+int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq);
+int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq,
+		   u32 notify_type, u64 irp_handle);
+
+/*
+ * QP operations.
+ */
+int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		   struct hvnd_qp *qp);
+
+int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		 struct hvnd_qp *qp);
+
+int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		  struct hvnd_qp *qp);
+
+/*
+ * MR operations.
+ */
+
+int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+	       u64 pd_handle, u64 *mr_handle);
+
+int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		 u64 handle);
+
+int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		     struct hvnd_mr *mr);
+int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		       u64 handle);
+
+/*
+ * Listener operations
+ */
+int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, u64 *handle);
+
+
+int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		       u64 listener_handle);
+
+int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+		       u64 listener_handle, union nd_sockaddr_inet *addr);
+
+int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+			 u64 listener_handle, u32 backlog);
+
+int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+			   u64 listener_handle, union nd_sockaddr_inet *addr);
+
+int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
+				 u64 listener_handle, u64 connector_handle,
+				 u64 irp_handle);
+
+/*
+ * Connector operations.
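+ *
+ * (The listener calls above follow the NetworkDirect sequence: create a
+ * listener, bind it to a local address, then arm it with a backlog. A
+ * plausible call sequence, with error unwinding kept minimal (sketch):
+ *
+ *	u64 listener;
+ *	union nd_sockaddr_inet laddr;	// assumed already populated
+ *
+ *	ret = hvnd_cr_listener(nd_dev, uctx, &listener);
+ *	if (ret)
+ *		return ret;
+ *	ret = hvnd_bind_listener(nd_dev, uctx, listener, &laddr);
+ *	if (!ret)
+ *		ret = hvnd_listen_listener(nd_dev, uctx, listener, 128);
+ *	if (ret)
+ *		hvnd_free_listener(nd_dev, uctx, listener);
+ *
+ * after which hvnd_get_connection_listener() pulls individual connection
+ * requests off the listener.)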
+ */ +int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 *connector_handle); + +int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 handle); + +int hvnd_cancelio_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 handle); +int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 handle, union nd_sockaddr_inet *addr); + +int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit, + u32 priv_data_length, const u8 *priv_data, + u64 qp_handle, struct if_physical_addr *phys_addr, + union nd_sockaddr_inet *dest_addr, struct hvnd_ep_obj *ep); + +int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 connector_handle, enum ibv_qp_state *qp_state); + +int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 connector_handle, + u64 qp_handle, + u32 in_rd_limit, u32 out_rd_limit, + u32 priv_data_length, const u8 *priv_data, + enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep); + +int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 connector_handle, + u32 priv_data_length, u8 *priv_data, + enum ibv_qp_state *qp_state); + +int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, + struct nd_read_limits *rd_limits); + +int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, + u8 *priv_data); + +int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, + union nd_sockaddr_inet *peer_addr); + +int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, + union nd_sockaddr_inet *local_addr); + +int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, struct hvnd_ep_obj *ep); + + +int hvnd_connector_disconnect(struct hvnd_dev *nd_dev, + struct hvnd_ucontext *uctx, + u64 connector_handle, struct hvnd_ep_obj *ep); + +int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + u64 handle, u32 ioctl); + +int hvnd_cancel_io(struct hvnd_ep_obj *ep_object); + +char *hvnd_get_op_name(int ioctl); + +void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx); +void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev,struct hvnd_ucontext *uctx); +void hvnd_process_events(struct work_struct *work); + +void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep, int status); +void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep, int status); +void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status); +void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status); +void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status); + +void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp); +int get_irp_handle(struct hvnd_dev *nd_dev, u32 *local, void *irp_ctx); + +void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr, + u32 data_sz, u32 local, u32 remote, + u32 ioctl_code, + u32 ext_data_sz, u32 ext_data_offset, + u64 irp_handle); + +int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev, + struct ndv_packet_hdr_control_1 *hdr, + u32 pkt_size, u64 cookie); + +int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, + union nd_sockaddr_inet *og_addr); + +int hvnd_get_neigh_mac_addr(struct sockaddr 
*local, struct sockaddr *remote, char *mac_addr); + +void hvnd_addr_init(void); + +void hvnd_addr_deinit(void); + +bool ep_add_work_pending(struct hvnd_ep_obj *ep_object); +void ep_del_work_pending(struct hvnd_ep_obj *ep_object); +void ep_stop(struct hvnd_ep_obj *ep_object); + +#define current_pid() (current->pid) +/* + * NT STATUS defines. + */ + +#define STATUS_SUCCESS 0x0 +#define STATUS_PENDING 0x00000103 +#define STATUS_CANCELLED 0xC0000120 +#define STATUS_DISCONNECTED 0xC000020C +#define STATUS_TIMEOUT 0xC00000B5 + +void inc_ioctl_counter_request(unsigned ioctl); +void inc_ioctl_counter_response(unsigned ioctl); + +#define NDV_PROTOCOL_VAERSION_INVALID -1 +#define NDV_PACKET_INIT_SIZE 16 /* Size of the INIT packet */ + +#define HVND_RING_SZ (PAGE_SIZE * 64) + +/* logging levels */ +#define HVND_ERROR 0 +#define HVND_WARN 1 +#define HVND_INFO 2 +#define HVND_DEBUG 3 + +extern int hvnd_log_level; + +#define hvnd_error(fmt, args...) hvnd_log(HVND_ERROR, fmt, ##args) +#define hvnd_warn(fmt, args...) hvnd_log(HVND_WARN, fmt, ##args) +#define hvnd_info(fmt, args...) hvnd_log(HVND_INFO, fmt, ##args) +#define hvnd_debug(fmt, args...) hvnd_log(HVND_DEBUG, fmt, ##args) + +#define hvnd_log(level, fmt, args...) \ +do { \ + if (unlikely(hvnd_log_level >= (level))) \ + printk(KERN_ERR "hvnd %s[%u]: " fmt, __func__, __LINE__, ##args); \ +} while (0) + +#endif /* _VMBUS_RDMA_H */ diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/COPYING b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/COPYING deleted file mode 100644 index f0960c946b07..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/COPYING +++ /dev/null @@ -1,339 +0,0 @@ - -"This software program is licensed subject to the GNU General Public License -(GPL). Version 2, June 1991, available at -<http://www.fsf.org/copyleft/gpl.html>" - -GNU General Public License - -Version 2, June 1991 - -Copyright (C) 1989, 1991 Free Software Foundation, Inc. -51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA - -Everyone is permitted to copy and distribute verbatim copies of this license -document, but changing it is not allowed. - -Preamble - -The licenses for most software are designed to take away your freedom to -share and change it. By contrast, the GNU General Public License is intended -to guarantee your freedom to share and change free software--to make sure -the software is free for all its users. This General Public License applies -to most of the Free Software Foundation's software and to any other program -whose authors commit to using it. (Some other Free Software Foundation -software is covered by the GNU Library General Public License instead.) You -can apply it to your programs, too. - -When we speak of free software, we are referring to freedom, not price. Our -General Public Licenses are designed to make sure that you have the freedom -to distribute copies of free software (and charge for this service if you -wish), that you receive source code or can get it if you want it, that you -can change the software or use pieces of it in new free programs; and that -you know you can do these things. - -To protect your rights, we need to make restrictions that forbid anyone to -deny you these rights or to ask you to surrender the rights. These -restrictions translate to certain responsibilities for you if you distribute -copies of the software, or if you modify it. - -For example, if you distribute copies of such a program, whether gratis or -for a fee, you must give the recipients all the rights that you have. 
You -must make sure that they, too, receive or can get the source code. And you -must show them these terms so they know their rights. - -We protect your rights with two steps: (1) copyright the software, and (2) -offer you this license which gives you legal permission to copy, distribute -and/or modify the software. - -Also, for each author's protection and ours, we want to make certain that -everyone understands that there is no warranty for this free software. If -the software is modified by someone else and passed on, we want its -recipients to know that what they have is not the original, so that any -problems introduced by others will not reflect on the original authors' -reputations. - -Finally, any free program is threatened constantly by software patents. We -wish to avoid the danger that redistributors of a free program will -individually obtain patent licenses, in effect making the program -proprietary. To prevent this, we have made it clear that any patent must be -licensed for everyone's free use or not licensed at all. - -The precise terms and conditions for copying, distribution and modification -follow. - -TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION - -0. This License applies to any program or other work which contains a notice - placed by the copyright holder saying it may be distributed under the - terms of this General Public License. The "Program", below, refers to any - such program or work, and a "work based on the Program" means either the - Program or any derivative work under copyright law: that is to say, a - work containing the Program or a portion of it, either verbatim or with - modifications and/or translated into another language. (Hereinafter, - translation is included without limitation in the term "modification".) - Each licensee is addressed as "you". - - Activities other than copying, distribution and modification are not - covered by this License; they are outside its scope. The act of running - the Program is not restricted, and the output from the Program is covered - only if its contents constitute a work based on the Program (independent - of having been made by running the Program). Whether that is true depends - on what the Program does. - -1. You may copy and distribute verbatim copies of the Program's source code - as you receive it, in any medium, provided that you conspicuously and - appropriately publish on each copy an appropriate copyright notice and - disclaimer of warranty; keep intact all the notices that refer to this - License and to the absence of any warranty; and give any other recipients - of the Program a copy of this License along with the Program. - - You may charge a fee for the physical act of transferring a copy, and you - may at your option offer warranty protection in exchange for a fee. - -2. You may modify your copy or copies of the Program or any portion of it, - thus forming a work based on the Program, and copy and distribute such - modifications or work under the terms of Section 1 above, provided that - you also meet all of these conditions: - - * a) You must cause the modified files to carry prominent notices stating - that you changed the files and the date of any change. - - * b) You must cause any work that you distribute or publish, that in - whole or in part contains or is derived from the Program or any part - thereof, to be licensed as a whole at no charge to all third parties - under the terms of this License. 
- - * c) If the modified program normally reads commands interactively when - run, you must cause it, when started running for such interactive - use in the most ordinary way, to print or display an announcement - including an appropriate copyright notice and a notice that there is - no warranty (or else, saying that you provide a warranty) and that - users may redistribute the program under these conditions, and - telling the user how to view a copy of this License. (Exception: if - the Program itself is interactive but does not normally print such - an announcement, your work based on the Program is not required to - print an announcement.) - - These requirements apply to the modified work as a whole. If identifiable - sections of that work are not derived from the Program, and can be - reasonably considered independent and separate works in themselves, then - this License, and its terms, do not apply to those sections when you - distribute them as separate works. But when you distribute the same - sections as part of a whole which is a work based on the Program, the - distribution of the whole must be on the terms of this License, whose - permissions for other licensees extend to the entire whole, and thus to - each and every part regardless of who wrote it. - - Thus, it is not the intent of this section to claim rights or contest - your rights to work written entirely by you; rather, the intent is to - exercise the right to control the distribution of derivative or - collective works based on the Program. - - In addition, mere aggregation of another work not based on the Program - with the Program (or with a work based on the Program) on a volume of a - storage or distribution medium does not bring the other work under the - scope of this License. - -3. You may copy and distribute the Program (or a work based on it, under - Section 2) in object code or executable form under the terms of Sections - 1 and 2 above provided that you also do one of the following: - - * a) Accompany it with the complete corresponding machine-readable source - code, which must be distributed under the terms of Sections 1 and 2 - above on a medium customarily used for software interchange; or, - - * b) Accompany it with a written offer, valid for at least three years, - to give any third party, for a charge no more than your cost of - physically performing source distribution, a complete machine- - readable copy of the corresponding source code, to be distributed - under the terms of Sections 1 and 2 above on a medium customarily - used for software interchange; or, - - * c) Accompany it with the information you received as to the offer to - distribute corresponding source code. (This alternative is allowed - only for noncommercial distribution and only if you received the - program in object code or executable form with such an offer, in - accord with Subsection b above.) - - The source code for a work means the preferred form of the work for - making modifications to it. For an executable work, complete source code - means all the source code for all modules it contains, plus any - associated interface definition files, plus the scripts used to control - compilation and installation of the executable. 
However, as a special - exception, the source code distributed need not include anything that is - normally distributed (in either source or binary form) with the major - components (compiler, kernel, and so on) of the operating system on which - the executable runs, unless that component itself accompanies the - executable. - - If distribution of executable or object code is made by offering access - to copy from a designated place, then offering equivalent access to copy - the source code from the same place counts as distribution of the source - code, even though third parties are not compelled to copy the source - along with the object code. - -4. You may not copy, modify, sublicense, or distribute the Program except as - expressly provided under this License. Any attempt otherwise to copy, - modify, sublicense or distribute the Program is void, and will - automatically terminate your rights under this License. However, parties - who have received copies, or rights, from you under this License will not - have their licenses terminated so long as such parties remain in full - compliance. - -5. You are not required to accept this License, since you have not signed - it. However, nothing else grants you permission to modify or distribute - the Program or its derivative works. These actions are prohibited by law - if you do not accept this License. Therefore, by modifying or - distributing the Program (or any work based on the Program), you - indicate your acceptance of this License to do so, and all its terms and - conditions for copying, distributing or modifying the Program or works - based on it. - -6. Each time you redistribute the Program (or any work based on the - Program), the recipient automatically receives a license from the - original licensor to copy, distribute or modify the Program subject to - these terms and conditions. You may not impose any further restrictions - on the recipients' exercise of the rights granted herein. You are not - responsible for enforcing compliance by third parties to this License. - -7. If, as a consequence of a court judgment or allegation of patent - infringement or for any other reason (not limited to patent issues), - conditions are imposed on you (whether by court order, agreement or - otherwise) that contradict the conditions of this License, they do not - excuse you from the conditions of this License. If you cannot distribute - so as to satisfy simultaneously your obligations under this License and - any other pertinent obligations, then as a consequence you may not - distribute the Program at all. For example, if a patent license would - not permit royalty-free redistribution of the Program by all those who - receive copies directly or indirectly through you, then the only way you - could satisfy both it and this License would be to refrain entirely from - distribution of the Program. - - If any portion of this section is held invalid or unenforceable under any - particular circumstance, the balance of the section is intended to apply - and the section as a whole is intended to apply in other circumstances. - - It is not the purpose of this section to induce you to infringe any - patents or other property right claims or to contest validity of any - such claims; this section has the sole purpose of protecting the - integrity of the free software distribution system, which is implemented - by public license practices. 
Many people have made generous contributions - to the wide range of software distributed through that system in - reliance on consistent application of that system; it is up to the - author/donor to decide if he or she is willing to distribute software - through any other system and a licensee cannot impose that choice. - - This section is intended to make thoroughly clear what is believed to be - a consequence of the rest of this License. - -8. If the distribution and/or use of the Program is restricted in certain - countries either by patents or by copyrighted interfaces, the original - copyright holder who places the Program under this License may add an - explicit geographical distribution limitation excluding those countries, - so that distribution is permitted only in or among countries not thus - excluded. In such case, this License incorporates the limitation as if - written in the body of this License. - -9. The Free Software Foundation may publish revised and/or new versions of - the General Public License from time to time. Such new versions will be - similar in spirit to the present version, but may differ in detail to - address new problems or concerns. - - Each version is given a distinguishing version number. If the Program - specifies a version number of this License which applies to it and "any - later version", you have the option of following the terms and - conditions either of that version or of any later version published by - the Free Software Foundation. If the Program does not specify a version - number of this License, you may choose any version ever published by the - Free Software Foundation. - -10. If you wish to incorporate parts of the Program into other free programs - whose distribution conditions are different, write to the author to ask - for permission. For software which is copyrighted by the Free Software - Foundation, write to the Free Software Foundation; we sometimes make - exceptions for this. Our decision will be guided by the two goals of - preserving the free status of all derivatives of our free software and - of promoting the sharing and reuse of software generally. - - NO WARRANTY - -11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY - FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN - OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES - PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER - EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED - WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE - ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH - YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL - NECESSARY SERVICING, REPAIR OR CORRECTION. - -12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING - WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR - REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR - DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL - DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM - (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED - INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF - THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR - OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 
- -END OF TERMS AND CONDITIONS - -How to Apply These Terms to Your New Programs - -If you develop a new program, and you want it to be of the greatest -possible use to the public, the best way to achieve this is to make it free -software which everyone can redistribute and change under these terms. - -To do so, attach the following notices to the program. It is safest to -attach them to the start of each source file to most effectively convey the -exclusion of warranty; and each file should have at least the "copyright" -line and a pointer to where the full notice is found. - -one line to give the program's name and an idea of what it does. -Copyright (C) yyyy name of author - -This program is free software; you can redistribute it and/or modify it -under the terms of the GNU General Public License as published by the Free -Software Foundation; either version 2 of the License, or (at your option) -any later version. - -This program is distributed in the hope that it will be useful, but WITHOUT -ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or -FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for -more details. - -You should have received a copy of the GNU General Public License along with -this program; if not, write to the Free Software Foundation, Inc., 59 -Temple Place - Suite 330, Boston, MA 02111-1307, USA. - -Also add information on how to contact you by electronic and paper mail. - -If the program is interactive, make it output a short notice like this when -it starts in an interactive mode: - -Gnomovision version 69, Copyright (C) year name of author Gnomovision comes -with ABSOLUTELY NO WARRANTY; for details type 'show w'. This is free -software, and you are welcome to redistribute it under certain conditions; -type 'show c' for details. - -The hypothetical commands 'show w' and 'show c' should show the appropriate -parts of the General Public License. Of course, the commands you use may be -called something other than 'show w' and 'show c'; they could even be -mouse-clicks or menu items--whatever suits your program. - -You should also get your employer (if you work as a programmer) or your -school, if any, to sign a "copyright disclaimer" for the program, if -necessary. Here is a sample; alter the names: - -Yoyodyne, Inc., hereby disclaims all copyright interest in the program -'Gnomovision' (which makes passes at compilers) written by James Hacker. - -signature of Ty Coon, 1 April 1989 -Ty Coon, President of Vice - -This General Public License does not permit incorporating your program into -proprietary programs. If your program is a subroutine library, you may -consider it more useful to permit linking proprietary applications with the -library. If this is what you want to do, use the GNU Library General Public -License instead of this License. 
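The hvnd_log() wrapper at the end of vmbus_rdma.h above gates every message on the module-wide hvnd_log_level (exposed as a 0644 module parameter, so it can be raised on a running system) and prints with the calling function and line number prepended. A call such as

	hvnd_debug("canceling work for ep %p\n", ep);

therefore expands to, roughly:

	if (unlikely(hvnd_log_level >= HVND_DEBUG))
		printk(KERN_ERR "hvnd %s[%u]: canceling work for ep %p\n",
		       __func__, __LINE__, ep);

so even debug messages are emitted at KERN_ERR severity, but only when explicitly enabled.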
diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/Makefile b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/Makefile deleted file mode 100644 index 53814a4b8e41..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/Makefile +++ /dev/null @@ -1,3 +0,0 @@ -obj-$(CONFIG_HYPERV_INFINIBAND_ND) += hv_network_direct_144.o - -hv_network_direct_144-y := provider.o vmbus_rdma.o hvnd_addr.o diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/hvnd_addr.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/hvnd_addr.c deleted file mode 100644 index c8c2295b63af..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/hvnd_addr.c +++ /dev/null @@ -1,136 +0,0 @@ -/* - * Copyright (c) 2014, Microsoft Corporation. - * - * Author: - * K. Y. Srinivasan <kys@microsoft.com> - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License version 2 as published - * by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or - * NON INFRINGEMENT. See the GNU General Public License for more - * details. - * - * Bug fixes/enhancements: Long Li <longli@microsoft.com> - */ - -#include <linux/completion.h> -#include <linux/module.h> -#include <linux/errno.h> -#include <linux/hyperv.h> -#include <linux/efi.h> -#include <linux/slab.h> -#include <linux/cred.h> -#include <linux/sched.h> -#include <linux/types.h> -#include <linux/completion.h> -#include <linux/scatterlist.h> -#include <rdma/ib_umem.h> -#include <rdma/ib_addr.h> - -#include "vmbus_rdma.h" - - -#include <linux/semaphore.h> -#include <linux/fs.h> -#include <linux/nls.h> -#include <linux/workqueue.h> -#include <linux/cdev.h> -#include <linux/hyperv.h> -#include <linux/sched.h> -#include <linux/uaccess.h> -#include <linux/miscdevice.h> -#include <linux/hyperv.h> - - -int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - union nd_sockaddr_inet *og_addr) -{ - int ret; - /* - * Query the host and select the first address. - */ - struct pkt_query_addr_list pkt; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - - hvnd_init_hdr(&pkt.hdr, - (sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1)), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST, 0, 0, 0); - - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = uctx->adaptor_hdl; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, - sizeof(pkt), (u64)&pkt); - - if (ret) - return ret; - - /* - * Copy the address out. 
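-	 *
-	 * (The (u64)&pkt cookie passed above is presumably how the host's
-	 * reply is matched back to this stack-allocated packet: the channel
-	 * callback would recover a waiter and complete it, along the lines
-	 * of the hvnd_cookie pattern from vmbus_rdma.h:
-	 *
-	 *	struct hvnd_cookie cookie;
-	 *
-	 *	init_completion(&cookie.host_event);
-	 *	cookie.pkt = &pkt;
-	 *	// send, then sleep until hvnd_callback() completes us
-	 *	wait_for_completion(&cookie.host_event);
-	 *
-	 * a sketch only; the actual plumbing is in vmbus_rdma.c, which this
-	 * patch moves verbatim.)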
- */ - - memcpy(og_addr, &pkt.ioctl.out[0], sizeof(*og_addr)); - return 0; - -} - -static struct rdma_addr_client self; - -struct resolve_cb_context { - struct rdma_dev_addr *addr; - struct completion comp; -}; - -void hvnd_addr_init(void) -{ - rdma_addr_register_client(&self); -} - -void hvnd_addr_deinit(void) -{ - rdma_addr_unregister_client(&self); -} - -static void resolve_cb(int status, struct sockaddr *src_addr, - struct rdma_dev_addr *addr, void *context) -{ - memcpy(((struct resolve_cb_context *)context)->addr, addr, sizeof(struct - rdma_dev_addr)); - complete(&((struct resolve_cb_context *)context)->comp); -} - -int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote, - char *mac_addr) -{ - struct rdma_dev_addr dev_addr; - struct resolve_cb_context ctx; - int ret; - - memset(&dev_addr, 0, sizeof(dev_addr)); - dev_addr.net = &init_net; - ctx.addr = &dev_addr; - init_completion(&ctx.comp); - - ret = rdma_resolve_ip(&self, local, remote, &dev_addr, 1000, - resolve_cb, &ctx); - - if (ret) { - hvnd_error("rdma_resolve_ip failed ret=%d\n", ret); - return ret; - } - - wait_for_completion(&ctx.comp); - memcpy(mac_addr, dev_addr.dst_dev_addr, ETH_ALEN); - return ret; -} diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/mx_abi.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/mx_abi.h deleted file mode 100644 index 8f55cb408de0..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/mx_abi.h +++ /dev/null @@ -1,232 +0,0 @@ -/* - * Copyright (c) 2005 Topspin Communications. All rights reserved. - * Copyright (c) 2005 Cisco Systems. All rights reserved. - * Copyright (c) 2005 PathScale, Inc. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. - * - * KYS: made some modifications. - */ - -#ifndef MX_ABI_H -#define MX_ABI_H - - - -/* - * Make sure that all structs defined in this file remain laid out so - * that they pack the same way on 32-bit and 64-bit architectures (to - * avoid incompatibility between 32-bit userspace and 64-bit kernels). - * Specifically: - * - Do not use pointer types -- pass pointers in UINT64 instead. - * - Make sure that any structure larger than 4 bytes is padded to a - * multiple of 8 bytes. 
Otherwise the structure size will be - * different between 32-bit and 64-bit architectures. - */ - -enum ibv_get_context_mappings { - IBV_GET_CONTEXT_UAR, - IBV_GET_CONTEXT_BF, - IBV_GET_CONTEXT_MAPPING_MAX -}; - -struct ibv_get_context_req { - - union nd_mapping mappings[IBV_GET_CONTEXT_MAPPING_MAX]; -}; - -struct ibv_get_context_resp { - - // mmap UAR and BF - struct nd_mapping_result mapping_results[IBV_GET_CONTEXT_MAPPING_MAX]; - - // mmap Blue Flame - int bf_buf_size; - int bf_offset; - - // mlx4_query_device result - int max_qp_wr; - int max_sge; - int max_cqe; - - // general parameters - u32 cqe_size; - u32 vend_id; - u16 dev_id; - u16 bf_reg_size; - u16 bf_regs_per_page; - u16 reserved1; - - // ibv_cmd_get_context result - u32 qp_tab_size; - - u32 reserved2; -}; - -struct ibv_alloc_pd_resp { - u64 pd_handle; - u32 pdn; - u32 reserved; -}; - -struct ibv_reg_mr { - u64 start; - u64 length; - u64 hca_va; - u32 access_flags; - u32 pdn; - u64 pd_handle; -}; - -struct ibv_reg_mr_resp { - u64 mr_handle; - u32 lkey; - u32 rkey; -}; - - -enum mlx4_ib_create_cq_mapping { - MLX4_IB_CREATE_CQ_BUF, - MLX4_IB_CREATE_CQ_DB, - MLX4_IB_CREATE_CQ_ARM_SN, // Windows specific - MLX4_IB_CREATE_CQ_MAPPING_MAX -}; - -#define MLX4_CQ_FLAGS_ARM_IN_KERNEL 1 - -struct ibv_create_cq { - union nd_mapping mappings[MLX4_IB_CREATE_CQ_MAPPING_MAX]; - u32 flags; -}; - -struct ibv_create_cq_resp { - struct nd_mapping_result mapping_results[MLX4_IB_CREATE_CQ_MAPPING_MAX]; - u32 cqn; - u32 cqe; -}; - -enum mlx4_ib_create_srq_mappings { - MLX4_IB_CREATE_SRQ_BUF, - MLX4_IB_CREATE_SRQ_DB, - MLX4_IB_CREATE_SRQ_MAPPINGS_MAX -}; - -struct ibv_create_srq { - union nd_mapping mappings[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX]; -}; - -struct ibv_create_srq_resp { - struct nd_mapping_result mapping_results[MLX4_IB_CREATE_SRQ_MAPPINGS_MAX]; -}; - -enum mlx4_ib_create_qp_mappings { - MLX4_IB_CREATE_QP_BUF, - MLX4_IB_CREATE_QP_DB, - MLX4_IB_CREATE_QP_MAPPINGS_MAX -}; - -struct ibv_create_qp { - union nd_mapping mappings[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; - u8 log_sq_bb_count; - u8 log_sq_stride; - u8 sq_no_prefetch; - u8 reserved; -}; - -struct ibv_create_qp_resp { - struct nd_mapping_result mapping_results[MLX4_IB_CREATE_QP_MAPPINGS_MAX]; - // struct ib_uverbs_create_qp_resp - u64 qp_handle; - u32 qpn; - u32 max_send_wr; - u32 max_recv_wr; - u32 max_send_sge; - u32 max_recv_sge; - u32 max_inline_data; -}; - -enum ibv_qp_attr_mask { - IBV_QP_STATE = 1 << 0, - IBV_QP_CUR_STATE = 1 << 1, - IBV_QP_EN_SQD_ASYNC_NOTIFY = 1 << 2, - IBV_QP_ACCESS_FLAGS = 1 << 3, - IBV_QP_PKEY_INDEX = 1 << 4, - IBV_QP_PORT = 1 << 5, - IBV_QP_QKEY = 1 << 6, - IBV_QP_AV = 1 << 7, - IBV_QP_PATH_MTU = 1 << 8, - IBV_QP_TIMEOUT = 1 << 9, - IBV_QP_RETRY_CNT = 1 << 10, - IBV_QP_RNR_RETRY = 1 << 11, - IBV_QP_RQ_PSN = 1 << 12, - IBV_QP_MAX_QP_RD_ATOMIC = 1 << 13, - IBV_QP_ALT_PATH = 1 << 14, - IBV_QP_MIN_RNR_TIMER = 1 << 15, - IBV_QP_SQ_PSN = 1 << 16, - IBV_QP_MAX_DEST_RD_ATOMIC = 1 << 17, - IBV_QP_PATH_MIG_STATE = 1 << 18, - IBV_QP_CAP = 1 << 19, - IBV_QP_DEST_QPN = 1 << 20 -}; - -enum ibv_qp_state { - IBV_QPS_RESET, - IBV_QPS_INIT, - IBV_QPS_RTR, - IBV_QPS_RTS, - IBV_QPS_SQD, - IBV_QPS_SQE, - IBV_QPS_ERR -}; - - -struct ibv_modify_qp_resp { - enum ibv_qp_attr_mask attr_mask; - u8 qp_state; - u8 reserved[3]; -}; - -struct ibv_create_ah_resp { - u64 start; -}; - -/* - * Some mlx4 specific kernel definitions. Perhaps could be in - * separate file. 
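- *
- * (On the 32/64-bit layout rule stated at the top of this header: a
- * struct such as
- *
- *	struct bad_example {
- *		u64 handle;
- *		u32 flags;
- *	};
- *
- * has sizeof() == 16 on x86_64 but 12 on i386, because only the 64-bit
- * ABI pads the tail out to the 8-byte alignment of u64. The explicit
- * u32 reserved members sprinkled through the structs above are that
- * padding, written out by hand so both ABIs agree.)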
- */ - -struct mlx4_ib_user_db_page { - struct list_head list; - struct ib_umem *umem; - unsigned long user_virt; - int refcnt; -}; - - -#endif /* MX_ABI_H */ diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/provider.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/provider.c deleted file mode 100644 index 7115cf769fe0..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/provider.c +++ /dev/null @@ -1,2961 +0,0 @@ -/* - * Copyright (c) 2014, Microsoft Corporation. - * - * Author: - * K. Y. Srinivasan <kys@microsoft.com> - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License version 2 as published - * by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or - * NON INFRINGEMENT. See the GNU General Public License for more - * details. - * - * Bug fixes/enhancements: Long Li <longli@microsoft.com> - */ - -#include <linux/module.h> -#include <linux/moduleparam.h> -#include <linux/device.h> -#include <linux/netdevice.h> -#include <linux/etherdevice.h> -#include <linux/delay.h> -#include <linux/errno.h> -#include <linux/list.h> -#include <linux/spinlock.h> -#include <linux/ethtool.h> -#include <linux/rtnetlink.h> -#include <linux/inetdevice.h> -#include <linux/io.h> -#include <linux/hyperv.h> -#include <linux/completion.h> -#include <rdma/iw_cm.h> -#include <rdma/ib_verbs.h> -#include <rdma/ib_smi.h> -#include <rdma/ib_umem.h> -#include <rdma/ib_user_verbs.h> -#include <rdma/rdma_cm.h> -#include <linux/proc_fs.h> -#include <linux/seq_file.h> -#include "vmbus_rdma.h" - -/* - * We are emulating mlx4. XXXKYS: May have to FIX. 
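- *
- * (The ep_add_work_pending()/ep_del_work_pending() helpers just below
- * implement a stop-and-drain idiom: every asynchronous callback brackets
- * its work with the pair, and ep_stop() sets ->stopping and then sleeps
- * until the pending count drains to zero. A caller looks roughly like:
- *
- *	if (!ep_add_work_pending(ep_object))
- *		return;			// endpoint is tearing down; drop event
- *	process_event(ep_object);	// hypothetical handler
- *	ep_del_work_pending(ep_object);
- *
- * so no new work can begin once ep_stop() has started waiting.)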
- */ -#include "./user.h" - -static struct hvnd_dev *g_nd_dev; // the one and only one - -int hvnd_log_level = HVND_ERROR; -module_param(hvnd_log_level, int, 0644); -MODULE_PARM_DESC(hvnd_log_level, - "Logging level, 0 - Error (default), 1 - Warning, 2 - Info, 3 - Debug."); - -static int disable_cq_notify = 1; -//static int disable_cq_notify = 0; -module_param(disable_cq_notify, int, 0644); -MODULE_PARM_DESC(disable_cq_notify, - "Disable CQ notification, 0 - Enable, 1 - Disable (default)."); - -enum { - MLX4_USER_DEV_CAP_64B_CQE = 1L << 0 -}; - -#define HVND_NODE_DESC "vmbus-RDMA" - -#undef MLX4_IB_UVERBS_ABI_VERSION -#define MLX4_IB_UVERBS_ABI_VERSION 4 - -struct mlx4_wqe_data_seg { - __be32 byte_count; - __be32 lkey; - __be64 addr; -}; - -/* return value: - * true: ep is running - * false: ep is stopped - */ -bool ep_add_work_pending(struct hvnd_ep_obj *ep_object) -{ - bool ret = true; - - atomic_inc(&ep_object->nr_requests_pending); - if (ep_object->stopping) { - if (atomic_dec_and_test(&ep_object->nr_requests_pending)) - wake_up(&ep_object->wait_pending); - ret = false; - } - return ret; -} - -void ep_del_work_pending(struct hvnd_ep_obj *ep_object) -{ - if (atomic_dec_and_test(&ep_object->nr_requests_pending)) - wake_up(&ep_object->wait_pending); - - if (atomic_read(&ep_object->nr_requests_pending) < 0) { - hvnd_error("ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", - atomic_read(&ep_object->nr_requests_pending), - ep_object->type, ep_object->cm_state); - dump_stack(); - } -} - -void ep_stop(struct hvnd_ep_obj *ep_object) -{ - if (!ep_object->stopping) { - ep_object->stopping = true; - hvnd_cancel_io(ep_object); - } - - if (atomic_read(&ep_object->nr_requests_pending) < 0) { - hvnd_error("IO canceled, ep_object->nr_requests_pending=%d type=%d cm_state=%d\n", - atomic_read(&ep_object->nr_requests_pending), - ep_object->type, ep_object->cm_state); - dump_stack(); - } - - wait_event(ep_object->wait_pending, - !atomic_read(&ep_object->nr_requests_pending)); -} - -static int vmbus_dma_map_sg(struct device *dev, struct scatterlist *sgl, - int nents, enum dma_data_direction direction, - unsigned long attrs) -{ - struct scatterlist *sg; - u64 addr; - int i; - int ret = nents; - - BUG_ON(!valid_dma_direction(direction)); - - for_each_sg(sgl, sg, nents, i) { - addr = (u64) page_address(sg_page(sg)); - /* TODO: handle highmem pages */ - if (!addr) { - ret = 0; - break; - } - sg->dma_address = addr + sg->offset; - sg->dma_length = sg->length; - } - return ret; -} - -static void vmbus_dma_unmap_sg(struct device *dev, - struct scatterlist *sg, int nents, - enum dma_data_direction direction, unsigned long attrs) -{ - BUG_ON(!valid_dma_direction(direction)); -} - - -const struct dma_map_ops vmbus_dma_ops = { - .map_sg = vmbus_dma_map_sg, - .unmap_sg = vmbus_dma_unmap_sg, -}; - -static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, - struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx); - -static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, - struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx); - -static void hvnd_deinit_ep(struct hvnd_ep_obj *ep) -{ - put_irp_handle(ep->nd_dev, ep->local_irp); -} - -static void hvnd_destroy_ep(struct hvnd_ep_obj *ep) -{ - hvnd_debug("canceling work for ep %p\n", ep); - cancel_work_sync(&ep->wrk.work); - hvnd_deinit_ep(ep); - kfree(ep); -} - - -#define UC(b) (((int)b)&0xff) -char *debug_inet_ntoa(struct in_addr in, char *b) -{ - register char *p; - - p = (char *)∈ - (void)snprintf(b, 20, - "%d.%d.%d.%d", 
UC(p[0]), UC(p[1]), UC(p[2]), UC(p[3])); - return b; -} - -void hvnd_process_events(struct work_struct *work); - -static int hvnd_init_ep(struct hvnd_ep_obj *ep_object, - struct iw_cm_id *cm_id, int ep_type, - struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx) -{ - int ret; - - ep_object->type = ep_type; - ep_object->cm_id = cm_id; - ep_object->nd_dev = nd_dev; - ep_object->uctx = uctx; - - ep_object->parent = NULL; - - ep_object->wrk.callback_arg = ep_object; - INIT_WORK(&ep_object->wrk.work, hvnd_process_events); - INIT_LIST_HEAD(&ep_object->incoming_pkt_list); - spin_lock_init(&ep_object->incoming_pkt_list_lock); - - ep_object->stopping = false; - atomic_set(&ep_object->nr_requests_pending, 0); - init_waitqueue_head(&ep_object->wait_pending); - - ret = get_irp_handle(nd_dev, &ep_object->local_irp, (void *)ep_object); - - if (ret) { - hvnd_error("get_irp_handle() failed: err: %d\n", ret); - return ret; - } - return 0; - -} - -static int set_rq_size(struct hvnd_dev *dev, struct ib_qp_cap *cap, - struct hvnd_qp *qp) -{ - - /* HW requires >= 1 RQ entry with >= 1 gather entry */ - if (!cap->max_recv_wr || !cap->max_recv_sge) - return -EINVAL; - - qp->rq_wqe_cnt = roundup_pow_of_two(max(1U, cap->max_recv_wr)); - qp->rq_max_gs = roundup_pow_of_two(max(1U, cap->max_recv_sge)); - qp->rq_wqe_shift = ilog2( - qp->rq_max_gs * sizeof(struct mlx4_wqe_data_seg)); - - - return 0; -} - -static int set_user_sq_size(struct hvnd_dev *dev, - struct hvnd_qp *qp, - struct mlx4_ib_create_qp *ucmd) -{ - qp->sq_wqe_cnt = 1 << ucmd->log_sq_bb_count; - qp->sq_wqe_shift = ucmd->log_sq_stride; - - qp->buf_size = (qp->rq_wqe_cnt << qp->rq_wqe_shift) + - (qp->sq_wqe_cnt << qp->sq_wqe_shift); - - return 0; -} - -static int hvnd_db_map_user(struct hvnd_ucontext *uctx, unsigned long virt, - struct ib_umem **db_umem) -{ - struct mlx4_ib_user_db_page *page; - int err = 0; - - mutex_lock(&uctx->db_page_mutex); - - list_for_each_entry(page, &uctx->db_page_list, list) - if (page->user_virt == (virt & PAGE_MASK)) - goto found; - - page = kmalloc(sizeof(*page), GFP_KERNEL); - if (!page) { - err = -ENOMEM; - goto out; - } - - page->user_virt = (virt & PAGE_MASK); - page->refcnt = 0; - page->umem = ib_umem_get(&uctx->ibucontext, virt & PAGE_MASK, - PAGE_SIZE, 0, 0); - if (IS_ERR(page->umem)) { - hvnd_error("ib_umem_get failure\n"); - err = PTR_ERR(page->umem); - kfree(page); - goto out; - } - - list_add(&page->list, &uctx->db_page_list); - -found: - ++page->refcnt; -out: - mutex_unlock(&uctx->db_page_mutex); - if (!err) - *db_umem = page->umem; - - return err; -} - -static void hvnd_db_unmap_user(struct hvnd_ucontext *uctx, u64 db_addr) -{ - struct mlx4_ib_user_db_page *page; - - mutex_lock(&uctx->db_page_mutex); - list_for_each_entry(page, &uctx->db_page_list, list) - if (page->user_virt == (db_addr & PAGE_MASK)) - goto found; - -found: - if (!--page->refcnt) { - list_del(&page->list); - ib_umem_release(page->umem); - kfree(page); - } - - mutex_unlock(&uctx->db_page_mutex); -} - - -static void debug_check(const char *func, int line) -{ - hvnd_debug("func is: %s; line is %d\n", func, line); - - if (in_interrupt()) { - hvnd_error("In interrupt func is: %s; line is %d\n", - func, line); - return; - } -} - -static struct ib_ah *hvnd_ah_create(struct ib_pd *pd, - struct ib_ah_attr *ah_attr, - struct ib_udata *udata) -{ - debug_check(__func__, __LINE__); - return ERR_PTR(-ENOSYS); -} - -static int hvnd_ah_destroy(struct ib_ah *ah) -{ - debug_check(__func__, __LINE__); - return -ENOSYS; -} - -static int hvnd_multicast_attach(struct 
ib_qp *ibqp, union ib_gid *gid, u16 lid) -{ - debug_check(__func__, __LINE__); - return -ENOSYS; -} - -static int hvnd_multicast_detach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid) -{ - debug_check(__func__, __LINE__); - return -ENOSYS; -} - -static int hvnd_process_mad(struct ib_device *ibdev, - int mad_flags, - u8 port_num, - const struct ib_wc *in_wc, - const struct ib_grh *in_grh, - const struct ib_mad_hdr *in_mad, - size_t in_mad_size, - struct ib_mad_hdr *out_mad, - size_t *out_mad_size, - u16 *out_mad_pkey_index) -{ - debug_check(__func__, __LINE__); - return -ENOSYS; -} - -void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx) -{ - atomic_inc(&uctx->refcnt); -} - -void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx) -{ - if (atomic_dec_and_test(&uctx->refcnt)) { - hvnd_debug("uctx ref cnt dropped it is %d\n", - atomic_read(&uctx->refcnt)); - hvnd_debug("About to close adaptor\n"); - hvnd_close_adaptor(nd_dev, uctx); - } else - hvnd_debug("uctx ref cnt dropped it is %d\n", - atomic_read(&uctx->refcnt)); -} - - -static int hvnd_dealloc_ucontext(struct ib_ucontext *context) -{ - struct hvnd_dev *nd_dev; - struct hvnd_ucontext *uctx; - - uctx = to_nd_context(context); - nd_dev = to_nd_dev(context->device); - - hvnd_debug("calling %s\n", __func__); - - hvnd_drop_uctx_ref(nd_dev, uctx); - - return 0; -} - -static struct ib_ucontext *hvnd_alloc_ucontext(struct ib_device *ibdev, - struct ib_udata *udata) -{ - struct hvnd_dev *nd_dev = to_nd_dev(ibdev); - struct hvnd_ucontext *uctx; - struct mlx4_ib_alloc_ucontext_resp resp; - struct ibv_get_context_resp *ctx_output; - int ret; - - if (!nd_dev->ib_active) { - hvnd_error("ib device is not active, try again\n"); - return ERR_PTR(-EAGAIN); - } - - uctx = get_uctx(nd_dev, current_pid()); - if (uctx) { - // it is already opened, just increase its reference count - hvnd_acquire_uctx_ref(uctx); - } else { - - /* - * The Windows host expects the following to be done: - * 1. Successfully send struct ndv_pkt_hdr_create_1 - * 2. INIT PROVIDER - * 3. Open Adapter - * Before we can complete this call. - */ - - uctx = kzalloc(sizeof(struct hvnd_ucontext), GFP_KERNEL); - if (!uctx) - return ERR_PTR(-ENOMEM); - - atomic_set(&uctx->refcnt, 1); - INIT_LIST_HEAD(&uctx->db_page_list); - mutex_init(&uctx->db_page_mutex); - - /* - * Stash away the context with the calling PID. - */ - ret = insert_handle(nd_dev, &nd_dev->uctxidr, uctx, - current_pid()); - if (ret) { - hvnd_error("Uctx ID insertion failed; ret is %d\n", - ret); - goto err1; - } - - hvnd_debug("Opening adaptor pid is %d\n", current_pid()); - - ret = hvnd_open_adaptor(nd_dev, uctx); - if (ret) { - hvnd_error("hvnd_open_adaptor failed ret=%d\n", ret); - goto err1; - } - - } - - /* - * Copy the response out. 
- */ - ctx_output = &uctx->o_adap_pkt.mappings.ctx_output; - resp.dev_caps = MLX4_USER_DEV_CAP_64B_CQE; - resp.qp_tab_size = ctx_output->qp_tab_size; - resp.bf_reg_size = ctx_output->bf_reg_size; - resp.bf_regs_per_page = ctx_output->bf_regs_per_page; - resp.cqe_size = ctx_output->cqe_size; - - ret = ib_copy_to_udata(udata, &resp, sizeof(resp)); - if (ret) { - hvnd_error("ib_copy_to_udata failed ret=%d\n", ret); - goto err1; - } - - return &uctx->ibucontext; - -err1: - kfree(uctx); - return ERR_PTR(ret); -} - -static int hvnd_mmap(struct ib_ucontext *context, struct vm_area_struct *vma) -{ - struct hvnd_ucontext *uctx = to_nd_context(context); - - if (vma->vm_end - vma->vm_start != PAGE_SIZE) { - hvnd_error("vma not a page size, actual size=%lu\n", - vma->vm_end - vma->vm_start); - return -EINVAL; - } - - if (vma->vm_pgoff == 0) { - vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); - - if (io_remap_pfn_range(vma, vma->vm_start, - (uctx->uar_base >> PAGE_SHIFT), - PAGE_SIZE, vma->vm_page_prot)) { - hvnd_error("io_remap_pfn_range failure\n"); - return -EAGAIN; - } - } else if (vma->vm_pgoff == 1 && uctx->bf_buf_size != 0) { - vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); - - if (io_remap_pfn_range(vma, vma->vm_start, - (uctx->uar_base >> PAGE_SHIFT) + 1, - PAGE_SIZE, vma->vm_page_prot)) { - hvnd_error("io_remap_pfn_range failure\n"); - return -EAGAIN; - } - } else { - hvnd_error("check code\n"); - return -EINVAL; - } - - return 0; -} - -static int hvnd_deallocate_pd(struct ib_pd *pd) -{ - - struct hvnd_ucontext *uctx; - struct hvnd_dev *nd_dev; - struct hvnd_ib_pd *hvnd_pd; - struct ib_ucontext *ibuctx = pd->uobject->context; - - hvnd_pd = to_nd_pd(pd); - nd_dev = to_nd_dev(pd->device); - uctx = to_nd_context(ibuctx); - - hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, - IOCTL_ND_PD_FREE); - - hvnd_drop_uctx_ref(nd_dev, uctx); - return 0; -} - -static struct ib_pd *hvnd_allocate_pd(struct ib_device *ibdev, - struct ib_ucontext *context, - struct ib_udata *udata) -{ - struct hvnd_ucontext *uctx; - struct hvnd_dev *nd_dev; - int ret; - struct hvnd_ib_pd *hvnd_pd; - - if (!context) { - hvnd_error("kernel mode context not supported\n"); - return ERR_PTR(-EINVAL); - } - - hvnd_pd = kzalloc(sizeof(struct hvnd_ib_pd), GFP_KERNEL); - - if (!hvnd_pd) - return ERR_PTR(-ENOMEM); - - uctx = to_nd_context(context); - nd_dev = to_nd_dev(ibdev); - - ret = hvnd_create_pd(uctx, nd_dev, hvnd_pd); - if (ret) { - hvnd_error("hvnd_create_pd failure ret=%d\n", ret); - goto error_cr_pd; - } - - if (context) { - if (ib_copy_to_udata(udata, &hvnd_pd->pdn, sizeof(__u32))) { - hvnd_error("ib_copy_to_udata failure\n"); - ret = -EFAULT; - goto error_fault; - } - } - - hvnd_acquire_uctx_ref(uctx); - return &hvnd_pd->ibpd; - -error_fault: - hvnd_free_handle(nd_dev, uctx, hvnd_pd->handle, - IOCTL_ND_PD_FREE); - -error_cr_pd: - kfree(hvnd_pd); - return ERR_PTR(ret); -} - -static int hvnd_query_pkey(struct ib_device *ibdev, u8 port, u16 index, - u16 *pkey) -{ - debug_check(__func__, __LINE__); - *pkey = 0; - return 0; -} - -static int hvnd_query_gid(struct ib_device *ibdev, u8 port, int index, - union ib_gid *gid) -{ - int ret; - struct hvnd_dev *nd_dev = to_nd_dev(ibdev); - - debug_check(__func__, __LINE__); - - ret = wait_for_completion_timeout(&nd_dev->addr_set, 60*HZ); - if (!ret) - return -ETIMEDOUT; - - memset(&(gid->raw[0]), 0, sizeof(gid->raw)); - memcpy(&(gid->raw[0]), nd_dev->mac_addr, 6); - return 0; -} - -static int hvnd_query_device(struct ib_device *ibdev, - struct ib_device_attr *props, - 
struct ib_udata *udata) -{ - struct hvnd_dev *nd_dev = to_nd_dev(ibdev); - struct adapter_info_v2 *adap_info; - - if (!nd_dev->query_pkt_set) { - hvnd_error("query packet not received yet\n"); - return 0; - } - - adap_info = &nd_dev->query_pkt.ioctl.ad_info; - - memset(props, 0, sizeof(*props)); - - /* - * Copy the relevant properties out. - */ - props->fw_ver = 0; - props->device_cap_flags = 0; - //props->device_cap_flags |= IB_DEVICE_BAD_PKEY_CNTR; - //props->device_cap_flags |= IB_DEVICE_BAD_QKEY_CNTR; - //props->device_cap_flags |= IB_DEVICE_XRC; - - props->vendor_id = 0x15b3; - props->vendor_part_id = adap_info->device_id; - - props->max_mr_size = ~0ull; - props->page_size_cap = PAGE_SIZE; - props->max_qp = 16384; - props->max_qp_wr = min(adap_info->max_recv_q_depth, - adap_info->max_initiator_q_depth); - - props->max_sge = min(adap_info->max_initiator_sge, - adap_info->max_recv_sge); - props->max_cq = 0x1FFFF; - props->max_cqe = adap_info->max_completion_q_depth; - props->max_mr = 16384; - props->max_pd = 16384; - - props->max_qp_rd_atom = adap_info->max_inbound_read_limit; - props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; - props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; - props->max_srq = 16384; - props->max_srq_wr = adap_info->max_recv_q_depth; - props->max_srq_sge = adap_info->max_recv_sge; - - return 0; -} - -static int hvnd_query_port(struct ib_device *ibdev, u8 port, - struct ib_port_attr *props) -{ - memset(props, 0, sizeof(struct ib_port_attr)); - - props->max_mtu = IB_MTU_4096; - props->active_mtu = IB_MTU_4096; - - /* - * KYS: TBD need to base this on netdev. - */ - props->state = IB_PORT_ACTIVE; - - props->port_cap_flags = IB_PORT_CM_SUP; - - props->gid_tbl_len = 1; - props->pkey_tbl_len = 1; - props->active_width = 1; - props->active_speed = IB_SPEED_DDR; //KYS: check - props->max_msg_sz = -1; - - return 0; -} - -static enum rdma_link_layer -hvnd_get_link_layer(struct ib_device *device, u8 port) -{ - return IB_LINK_LAYER_ETHERNET; -} - -static ssize_t hvnd_show_rev(struct device *dev, struct device_attribute *attr, - char *buf) -{ - return 0; -} - -static ssize_t hvnd_show_fw_ver(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return 0; -} - -static ssize_t hvnd_show_hca(struct device *dev, struct device_attribute *attr, - char *buf) -{ - return 0; -} - -static ssize_t hvnd_show_board(struct device *dev, - struct device_attribute *attr, char *buf) -{ - return 0; -} - -static int hvnd_get_port_immutable(struct ib_device *ibdev, u8 port_num, - struct ib_port_immutable *immutable) -{ - struct ib_port_attr attr; - int err; - - err = hvnd_query_port(ibdev, port_num, &attr); - if (err) - return err; - - immutable->pkey_tbl_len = attr.pkey_tbl_len; - immutable->gid_tbl_len = attr.gid_tbl_len; - immutable->core_cap_flags = RDMA_CORE_PORT_IWARP; - - return 0; -} - -static struct ib_qp *hvnd_ib_create_qp(struct ib_pd *pd, - struct ib_qp_init_attr *attrs, - struct ib_udata *udata) -{ - struct hvnd_ucontext *uctx; - struct hvnd_dev *nd_dev; - struct mlx4_ib_create_qp ucmd; - struct hvnd_qp *qp; - int ret = 0; - struct hvnd_ib_pd *hvnd_pd = to_nd_pd(pd); - struct hvnd_cq *send_cq = to_nd_cq(attrs->send_cq); - struct hvnd_cq *recv_cq = to_nd_cq(attrs->recv_cq); - - uctx = get_uctx_from_pd(pd); - nd_dev = to_nd_dev(pd->device); - - if (attrs->qp_type != IB_QPT_RC) { - hvnd_error("attrs->qp_type=%d not IB_QPT_RC\n", attrs->qp_type); - return ERR_PTR(-EINVAL); - } - - qp = kzalloc(sizeof(*qp), GFP_KERNEL); - if (!qp) { - ret = 
-ENOMEM; - goto err_done; - } - - qp->uctx = uctx; - - if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { - hvnd_error("ib_copy_from_udata failed\n"); - ret = -EFAULT; - goto err_ucpy; - } - - qp->qp_buf = (void *)ucmd.buf_addr; - qp->db_addr = (void *)ucmd.db_addr; - qp->log_sq_bb_count = ucmd.log_sq_bb_count; - qp->log_sq_stride = ucmd.log_sq_stride; - qp->sq_no_prefetch = ucmd.sq_no_prefetch; - qp->port = attrs->port_num; - - init_waitqueue_head(&qp->wait); - atomic_set(&qp->refcnt, 1); - - qp->recv_cq = recv_cq; - qp->send_cq = send_cq; - qp->nd_dev = nd_dev; - - qp->receive_cq_handle = recv_cq->cq_handle; - qp->initiator_cq_handle = send_cq->cq_handle; - qp->pd_handle = hvnd_pd->handle; - qp->cq_notify = false; - - qp->ibqp.qp_num = attrs->qp_type == IB_QPT_SMI ? 0 : 1; - - qp->max_inline_data = attrs->cap.max_inline_data; - - qp->initiator_q_depth = attrs->cap.max_send_wr; - qp->initiator_request_sge = attrs->cap.max_send_sge; - - - qp->receive_q_depth = attrs->cap.max_recv_wr; - qp->receive_request_sge = attrs->cap.max_recv_sge; - - set_rq_size(nd_dev, &attrs->cap, qp); - - set_user_sq_size(nd_dev, qp, &ucmd); - - qp->umem = ib_umem_get(&uctx->ibucontext, ucmd.buf_addr, - qp->buf_size, 0, 0); - if (IS_ERR(qp->umem)) { - ret = PTR_ERR(qp->umem); - hvnd_error("ib_umem_get failed ret=%d\n", ret); - goto err_ucpy; - } - - ret = hvnd_db_map_user(uctx, ucmd.db_addr, &qp->db_umem); - - if (ret) { - hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); - goto err_db_map; - } - - ret = hvnd_create_qp(nd_dev, uctx, qp); - - if (ret) { - hvnd_error("hvnd_create_qp failed ret=%d\n", ret); - goto err_qp; - } - - hvnd_acquire_uctx_ref(uctx); - - qp->ibqp.qp_num = qp->qpn; - qp->ibqp.qp_type = IB_QPT_RC; - - - return &qp->ibqp; - -err_qp: - hvnd_db_unmap_user(uctx, ucmd.db_addr); - -err_db_map: - ib_umem_release(qp->umem); - -err_ucpy: - kfree(qp); -err_done: - return ERR_PTR(ret); -} - -static int hvnd_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, - int attr_mask, struct ib_udata *udata) -{ - struct hvnd_qp *qp = to_nd_qp(ibqp); - struct hvnd_dev *nd_dev = to_nd_dev(ibqp->device); - enum ib_qp_state cur_state, new_state; - int ret = 0; - - if (attr != NULL) { - cur_state = (attr_mask & IB_QP_CUR_STATE) - ? attr->cur_qp_state : qp->qp_state; - new_state = (attr_mask & IB_QP_STATE) - ? attr->qp_state : cur_state; - - hvnd_debug("qp->qp_state is %d new state is %d\n", - qp->qp_state, new_state); - hvnd_debug("current qp state is %d\n", cur_state); - if (attr_mask & IB_QP_STATE) { - /* Ensure the state is valid */ - if (attr->qp_state < 0 || attr->qp_state > IB_QPS_ERR) { - hvnd_error("incorrect qp state attr->qp_state=%d\n", - attr->qp_state); - return -EINVAL; - } - - if (qp->qp_state != new_state) { - qp->qp_state = new_state; - /* - * The only state transition supported is the - * transition to error state. 
- */ - switch (new_state) { - case IB_QPS_ERR: - case IB_QPS_SQD: - ret = hvnd_flush_qp(nd_dev, qp->uctx, - qp); - - if (ret) - hvnd_error("hvnd_flush_qp failed ret=%d\n", ret); - - // immediately notify the upper layer - // on disconnection - if (!ret && qp->connector) - hvnd_process_notify_disconnect( - qp->connector, - STATUS_SUCCESS); - - return ret; - - default: - break; - } - } - } - } - return 0; -} - - -static int hvnd_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, - int attr_mask, struct ib_qp_init_attr *init_attr) -{ - struct hvnd_qp *qp = to_nd_qp(ibqp); - - memset(attr, 0, sizeof(*attr)); - memset(init_attr, 0, sizeof(*init_attr)); - - attr->qp_state = qp->qp_state; - - init_attr->cap.max_send_wr = qp->max_send_wr; - init_attr->cap.max_recv_wr = qp->max_recv_wr; - - init_attr->cap.max_send_sge = qp->max_send_sge; - init_attr->cap.max_recv_sge = qp->max_recv_sge; - init_attr->cap.max_inline_data = qp->max_inline_data; - - init_attr->sq_sig_type = IB_SIGNAL_ALL_WR; - - return 0; -} - -static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status); -static int hvnd_destroy_qp(struct ib_qp *ib_qp) -{ - int ret; - struct hvnd_qp *qp = to_nd_qp(ib_qp); - struct hvnd_dev *nd_dev = to_nd_dev(ib_qp->device); - u64 jiffies; - - if (!qp->connector) { - hvnd_warn("error: connector is NULL; skip destroying connector\n"); - goto free_qp; - } - - /* - * should we flush the qp first on ctrl-C?, - * no need to disconnect on abrupt shutdown? - */ - if (qp->qp_state != IB_QPS_ERR && qp->qp_state != IB_QPS_SQD) { - hvnd_warn("qp_state=%d, doing abrupt disconnect\n", - qp->qp_state); - hvnd_flush_qp(nd_dev, qp->uctx, qp); - - ep_stop(qp->connector); - - // now no pending activity is possible on the connector - - switch (qp->connector->cm_state) { - - case hvnd_cm_idle: - case hvnd_cm_connect_reply_refused: - case hvnd_cm_connect_request_sent: - case hvnd_cm_close_sent: - hvnd_warn("cm_state = %d not doing anything\n", - qp->connector->cm_state); - break; - - case hvnd_cm_connect_received: - hvnd_warn("cm_state = %d refusing pending connection request\n", - qp->connector->cm_state); - hvnd_refuse_connection(qp->connector, -ECONNREFUSED); - break; - - case hvnd_cm_connect_reply_sent: - case hvnd_cm_established_sent: - case hvnd_cm_accept_sent: - hvnd_warn("cm_state = %d notifying disconnect on existing connection\n", - qp->connector->cm_state); - hvnd_process_notify_disconnect(qp->connector, - STATUS_CANCELLED); - break; - - default: - hvnd_error("unknown cm_state = %d\n", - qp->connector->cm_state); - - } - goto free_connector; - } else { - hvnd_debug("qp_state=%d, doing normal disconnect\n", - qp->qp_state); - } - - if (!ep_add_work_pending(qp->connector)) - goto free_connector; - - init_completion(&qp->connector->disconnect_event); - - /* - * First issue a disconnect on the connector. - */ - - hvnd_debug("calling hvnd_connector_disconnect\n"); - ret = hvnd_connector_disconnect(nd_dev, qp->uctx, - qp->connector->ep_handle, - qp->connector); - if (ret) { - ep_del_work_pending(qp->connector); - hvnd_error("disconnect: retval is %d\n", ret); - ep_stop(qp->connector); - goto free_connector; - } - /* - * Now wait for the disconnect. - */ - jiffies = get_jiffies_64(); - wait_for_completion(&qp->connector->disconnect_event); - hvnd_debug("Completed disconnect connector=%p jiffies=%llu\n", - qp->connector, get_jiffies_64() - jiffies); - - /* - * Now free up the connector and drop the reference on uctx. 
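 *
 * hvnd_destroy_qp() above turns the asynchronous host disconnect into a
 * synchronous step with a completion. Reduced to its skeleton below;
 * issue_disconnect() is a stand-in for hvnd_connector_disconnect(), whose
 * result arrives on the event path.
 */

#include <linux/completion.h>
#include <linux/errno.h>

struct connector {
        struct completion disconnect_event;
};

static int issue_disconnect(struct connector *c);       /* async host call */

/* Teardown path: post the request, then block until the host replies. */
static int disconnect_sync(struct connector *c)
{
        int ret;

        init_completion(&c->disconnect_event);
        ret = issue_disconnect(c);
        if (ret)
                return ret;     /* never posted; do not wait */
        wait_for_completion(&c->disconnect_event);
        return 0;
}

/* Event path (hvnd_process_disconnect() above) signals the waiter. */
static void on_disconnect_done(struct connector *c)
{
        complete(&c->disconnect_event);
}

/*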
- */ - - ep_stop(qp->connector); - -free_connector: - hvnd_debug("destroying connector handle: %p\n", - (void *) qp->connector->ep_handle); - hvnd_free_handle(nd_dev, qp->uctx, - qp->connector->ep_handle, - IOCTL_ND_CONNECTOR_FREE); - - hvnd_drop_uctx_ref(nd_dev, qp->uctx); - hvnd_destroy_ep(qp->connector); - qp->connector = NULL; -free_qp: - atomic_dec(&qp->refcnt); - hvnd_debug("Waiting for the ref cnt to go to 0\n"); - - wait_event(qp->wait, !atomic_read(&qp->refcnt)); - - hvnd_debug("About to destroy qp\n"); - hvnd_db_unmap_user(qp->uctx, (u64)qp->db_addr); - ib_umem_release(qp->umem); - - hvnd_debug("About to free qp\n"); - ret = hvnd_free_qp(nd_dev, qp->uctx, qp); - - if (ret == 0) { - hvnd_drop_uctx_ref(nd_dev, qp->uctx); - kfree(qp); - } else { - hvnd_error("free qp failed: ret is %d\n", ret); - } - - return ret; -} - -static struct ib_cq *hvnd_ib_create_cq(struct ib_device *ibdev, - const struct ib_cq_init_attr *attr, - struct ib_ucontext *ib_context, - struct ib_udata *udata) -{ - struct hvnd_ucontext *uctx; - struct hvnd_dev *nd_dev; - struct mlx4_ib_create_cq ucmd; - struct hvnd_cq *cq; - int ret = 0; - int entries = attr->cqe; - - uctx = to_nd_context(ib_context); - nd_dev = to_nd_dev(ibdev); - - if (entries < 1 || entries > uctx->max_cqe) { - hvnd_error("incorrct entries=%d\n", entries); - ret = -EINVAL; - goto err_done; - } - - cq = kzalloc(sizeof(*cq), GFP_KERNEL); - if (!cq) { - ret = -ENOMEM; - goto err_done; - } - - entries = roundup_pow_of_two(entries + 1); - cq->ibcq.cqe = entries - 1; - cq->entries = entries; - cq->uctx = uctx; - - if (ib_copy_from_udata(&ucmd, udata, sizeof(ucmd))) { - hvnd_error("ib_copy_from_udata failed\n"); - ret = -EFAULT; - goto err_ucpy; - } - - cq->cq_buf = (void *)ucmd.buf_addr; - cq->db_addr = (void *)ucmd.db_addr; - cq->arm_sn = 0; - - /* - * Initialize the IRP state. Need to have a separate irp state - * for CQ; for now share it with Listener/connector. 
- */ - ret = hvnd_init_ep(&cq->ep_object, NULL, ND_CQ, nd_dev, uctx); - - if (ret) { - hvnd_error("hvnd_init_ep failed ret=%d\n", ret); - goto err_ucpy; - } - - cq->ep_object.cq = cq; - cq->monitor = true; - - cq->umem = ib_umem_get(ib_context, ucmd.buf_addr, - (entries * uctx->cqe_size), - IB_ACCESS_LOCAL_WRITE, 1); - if (IS_ERR(cq->umem)) { - ret = IS_ERR(cq->umem); - hvnd_error("ib_umem_get failed ret=%d\n", ret); - goto err_ucpy; - } - - ret = hvnd_db_map_user(uctx, ucmd.db_addr, &cq->db_umem); - - if (ret) { - hvnd_error("hvnd_db_map_user failed ret=%d\n", ret); - goto err_db_map; - } - - ret = hvnd_create_cq(nd_dev, uctx, cq); - - if (ret) { - hvnd_error("hvnd_create_cq failed ret=%d\n", ret); - goto err_cq; - } - - cq->ep_object.ep_handle = cq->cq_handle; - - if (ib_copy_to_udata(udata, &cq->cqn, sizeof(__u32))) { - hvnd_error("ib_copy_to_udata failed\n"); - ret = -EFAULT; - goto err_ucpy_out; - } - - if (!disable_cq_notify) { - - if (!ep_add_work_pending(&cq->ep_object)) - goto err_ucpy_out; - - ret = hvnd_notify_cq(nd_dev, cq, ND_CQ_NOTIFY_ANY, - (u64)&cq->ep_object); - - if (ret) { - ep_del_work_pending(&cq->ep_object); - hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); - goto err_ucpy_out; - } - } - - hvnd_acquire_uctx_ref(uctx); - - return &cq->ibcq; - -err_ucpy_out: - hvnd_destroy_cq(nd_dev, cq); - -err_cq: - hvnd_db_unmap_user(uctx, ucmd.db_addr); - -err_db_map: - ib_umem_release(cq->umem); - -err_ucpy: - kfree(cq); -err_done: - return ERR_PTR(ret); -} - -static struct ib_qp *hvnd_get_qp(struct ib_device *dev, int qpn) -{ - struct hvnd_dev *nd_dev; - struct hvnd_qp *qp = NULL; - - nd_dev = to_nd_dev(dev); - qp = get_qpp(nd_dev, qpn); - return qp ? &qp->ibqp : NULL; -} - -static int hvnd_ib_destroy_cq(struct ib_cq *ib_cq) -{ - struct hvnd_ucontext *uctx; - struct hvnd_dev *nd_dev; - struct hvnd_cq *cq; - - cq = to_nd_cq(ib_cq); - uctx = cq->uctx; - nd_dev = to_nd_dev(uctx->ibucontext.device); - - cq->monitor = false; - -// hvnd_cancel_io(&cq->ep_object); - ep_stop(&cq->ep_object); - - hvnd_deinit_ep(&cq->ep_object); - - hvnd_db_unmap_user(uctx, (u64)cq->db_addr); - ib_umem_release(cq->umem); - - hvnd_destroy_cq(nd_dev, cq); - - hvnd_drop_uctx_ref(nd_dev, uctx); - kfree(cq); - - return 0; -} - -static int hvnd_resize_cq(struct ib_cq *cq, int cqe, struct ib_udata *udata) -{ - /* - * NDDirect does not support resizing CQ. 
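 *
 * In hvnd_ib_create_cq() above the ring is sized mlx4-style: one slot is
 * reserved, the count is rounded up to a power of two, and the usable depth
 * reported back is ring - 1; e.g. a request for 100 CQEs yields a 128-entry
 * ring with 127 usable. (Nb: the "ret = IS_ERR(cq->umem)" in that path looks
 * like it was meant to be PTR_ERR(), since IS_ERR() only yields a boolean.)
 * The arithmetic in isolation:
 */

#include <linux/log2.h>

static size_t cq_buf_bytes(int requested_cqe, int cqe_size, int *usable)
{
        int ring = roundup_pow_of_two(requested_cqe + 1);

        *usable = ring - 1;                     /* what ibcq->cqe reports */
        return (size_t)ring * cqe_size;         /* what ib_umem_get() pins */
}

/*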
- */ - hvnd_info("check code\n"); - return -ENOSYS; -} - -static int hvnd_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc) -{ - hvnd_info("check code\n"); - return 0; -} - -static struct ib_mr *hvnd_get_dma_mr(struct ib_pd *pd, int acc) -{ - hvnd_info("check code\n"); - return NULL; -} - -static void debug_dump_umem(struct ib_umem *umem) -{ -#ifdef HVND_MEM_DEBUG - struct ib_umem_chunk *chunk; - struct scatterlist *sg; - int len, j, entry; - int shift = ffs(umem->page_size) - 1; - - hvnd_debug("umem=%p\n", umem); - hvnd_debug("context=%p length=%lu offset=%d page_size=%d writable=%d hugetlb=%d\n", - umem->context, - umem->length, - umem->offset, - umem->page_size, - umem->writable, - umem->hugetlb); - - list_for_each_entry(chunk, &umem->chunk_list, list) { - hvnd_debug("chunk->nmap=%d\n", chunk->nmap); - for (j = 0; j < chunk->nmap; ++j) { - sg = &chunk->page_list[j]; - hvnd_debug("sg_dma_len=%d sg_dma_address=%llx\n", - sg_dma_len(sg), sg_dma_address(sg)); - hvnd_debug("page_link=%lx offset=%u length=%u\n", - sg->page_link, sg->offset, sg->length); - len = sg_dma_len(&chunk->page_list[j]) >> shift; - for_each_sg(&chunk->page_list[j], sg, len, entry) { - hvnd_debug("PFN=%lu\n", - page_to_pfn(sg_page(sg))); - } - } - } -#endif -} - - -static struct ib_mr *hvnd_reg_user_mr(struct ib_pd *pd, u64 start, u64 length, - u64 virt, int acc, struct ib_udata *udata) -{ - int err = 0; - struct hvnd_ib_pd *hvndpd = to_nd_pd(pd); - struct hvnd_mr *mr; - - mr = kmalloc(sizeof(*mr), GFP_KERNEL); - if (!mr) - return ERR_PTR(-ENOMEM); - mr->pd = hvndpd; - - mr->umem = ib_umem_get(pd->uobject->context, start, length, acc, 0); - if (IS_ERR(mr->umem)) { - err = PTR_ERR(mr->umem); - hvnd_error("ib_umem_get failed ret=%d\n", err); - kfree(mr); - return ERR_PTR(err); - } - - debug_dump_umem(mr->umem); - - mr->start = start; - mr->length = length; - mr->virt = virt; - mr->acc = acc; - - hvnd_debug("start=%llx length=%llx virt=%llx acc=%d\n", - start, length, virt, acc); - - /* - * First create a memory region. - */ - err = hvnd_cr_mr(to_nd_dev(pd->device), - to_nd_context(pd->uobject->context), hvndpd->handle, - &mr->mr_handle); - if (err) { - hvnd_error("cr_mr failed; ret is %d\n", err); - goto err; - } - - err = hvnd_mr_register(to_nd_dev(pd->device), - to_nd_context(pd->uobject->context), mr); - - if (err) - goto err0; - - hvnd_acquire_uctx_ref(to_nd_context(pd->uobject->context)); - - return &mr->ibmr; - -err0: - hvnd_free_mr(to_nd_dev(pd->device), - to_nd_context(pd->uobject->context), mr->mr_handle); -err: - ib_umem_release(mr->umem); - kfree(mr); - return ERR_PTR(err); -} - - - -static int hvnd_dereg_mr(struct ib_mr *ib_mr) -{ - int ret; - struct hvnd_mr *mr = to_nd_mr(ib_mr); - struct hvnd_ucontext *uctx = to_nd_context(ib_mr->pd->uobject->context); - struct hvnd_dev *nd_dev = to_nd_dev(ib_mr->device); - - - hvnd_debug("dereg_mr entering\n"); - - ret = hvnd_deregister_mr(nd_dev, uctx, mr->mr_handle); - - if (ret) { - hvnd_error("hvnd_deregister_mr() failed: %x\n", ret); - return ret; - } - /* - * Now free up the memory region. 
- */ - - ret = hvnd_free_mr(nd_dev, uctx, mr->mr_handle); - if (ret) { - hvnd_error("hvnd_free_mr() failed: %x\n", ret); - return ret; - } - - ib_umem_release(mr->umem); - - hvnd_drop_uctx_ref(nd_dev, uctx); - kfree(mr); - - hvnd_debug("dereg_mr done\n"); - return 0; -} - -static struct ib_mw *hvnd_alloc_mw(struct ib_pd *pd, enum ib_mw_type type, - struct ib_udata *udata) -{ - hvnd_info("check code\n"); - return NULL; -} - -static int hvnd_dealloc_mw(struct ib_mw *mw) -{ - debug_check(__func__, __LINE__); - return 0; -} - -static int hvnd_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags) -{ - struct hvnd_ucontext *uctx; - struct hvnd_dev *nd_dev; - struct hvnd_cq *cq; - - cq = to_nd_cq(ibcq); - uctx = cq->uctx; - nd_dev = to_nd_dev(uctx->ibucontext.device); - - - debug_check(__func__, __LINE__); - - return 0; -} - -static int hvnd_post_send(struct ib_qp *ibqp, struct ib_send_wr *wr, - struct ib_send_wr **bad_wr) -{ - debug_check(__func__, __LINE__); - return 0; -} - -int hvnd_post_receive(struct ib_qp *ibqp, struct ib_recv_wr *wr, - struct ib_recv_wr **bad_wr) -{ - debug_check(__func__, __LINE__); - return 0; -} - -static int hvnd_resolve_addr(struct sockaddr_in *laddr, - struct sockaddr_in *raddr, - struct if_physical_addr *phys_addrstruct) -{ - int ret; - - phys_addrstruct->length = ETH_ALEN; - ret = hvnd_get_neigh_mac_addr((struct sockaddr *)laddr, - (struct sockaddr *)raddr, - phys_addrstruct->addr); - - hvnd_debug("Dest MAC is %pM\n", phys_addrstruct->addr); - return ret; -} - -static int hvnd_connect(struct iw_cm_id *cm_id, - struct iw_cm_conn_param *conn_param) -{ - int ret = 0; - struct hvnd_dev *nd_dev; - struct hvnd_ep_obj *ep_object; - struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_id->remote_addr; - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; - struct hvnd_qp *qp; - struct if_physical_addr phys_addrstruct; - union nd_sockaddr_inet dest_addr; - u64 connector_handle; - union nd_sockaddr_inet addr; - char addr_buf[50]; - - if (cm_id->remote_addr.ss_family != AF_INET) { - hvnd_error("cm_id->remote_addr.ss_family=%d not AF_INET\n", - cm_id->remote_addr.ss_family); - return -ENOSYS; - } - - qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); - - if (!qp) { - hvnd_error("failed to find qp conn_param->qpn=%d\n", - conn_param->qpn); - return -EINVAL; - } - - cm_id->provider_data = qp; - cm_id->add_ref(cm_id); - qp->cm_id = cm_id; - - /* - * Set the read/write limits. - * Can we change the limits on a created QP? Luke? - */ - nd_dev = to_nd_dev(cm_id->device); - ep_object = hvnd_setup_ep(cm_id, ND_CONNECTOR, nd_dev, qp->uctx); - hvnd_debug("active connection: local irp is %d\n", - ep_object->local_irp); - if (!ep_object) { - hvnd_error("hvnd_setup_ep failure\n"); - ret = -ENOMEM; - goto err_limit; - } - - ret = hvnd_cr_connector(nd_dev, qp->uctx, - &connector_handle); - - if (ret) { - hvnd_error("hvnd_cr_connector failure ret=%d\n", ret); - goto err_cr_connector; - } - - hvnd_acquire_uctx_ref(qp->uctx); - ep_object->ep_handle = connector_handle; - ep_object->incoming = false; - qp->connector = ep_object; - - /* - * Bind the local address to the connector. 
- */ - hvnd_debug("Connect local address is %s\n", - debug_inet_ntoa(laddr->sin_addr, addr_buf)); - - memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); - hvnd_debug("CONNECT AF %d port %d addr %s\n", - addr.ipv4.sin_family, addr.ipv4.sin_port, - debug_inet_ntoa(addr.ipv4.sin_addr, addr_buf)); - - ret = hvnd_bind_connector(nd_dev, qp->uctx, - connector_handle, - &addr); - - if (ret) { - hvnd_error("hvnd_bind_connector failed ret=%d\n", ret); - goto err_bind_connector; - } - - ret = hvnd_resolve_addr(laddr, raddr, &phys_addrstruct); - if (ret) { - hvnd_error("hvnd_resolve_addr failed ret=%d\n", ret); - goto err_bind_connector; - } - - memcpy(&dest_addr.ipv4, raddr, sizeof(struct sockaddr_in)); - - - - /* - * Now attempt to connect. - */ - - hvnd_debug("About to initiate connection\n"); - - if (!ep_add_work_pending(ep_object)) - goto err_bind_connector; - - ep_object->cm_state = hvnd_cm_connect_received; - ret = hvnd_connector_connect(nd_dev, qp->uctx, - ep_object->ep_handle, - conn_param->ird, conn_param->ord, - conn_param->private_data_len, - (u8 *)conn_param->private_data, - qp->qp_handle, - &phys_addrstruct, &dest_addr, - ep_object); - - if (ret == 0) - return 0; - - ep_object->cm_state = hvnd_cm_idle; - ep_del_work_pending(ep_object); - hvnd_error("hvnd_connector_connect failed ret=%d\n", ret); - -err_bind_connector: - qp->connector = NULL; - hvnd_free_connector(nd_dev, qp->uctx, - connector_handle); - hvnd_drop_uctx_ref(nd_dev, qp->uctx); - -err_cr_connector: - kfree(ep_object); - -err_limit: - cm_id->provider_data = NULL; - qp->cm_id = NULL; - cm_id->rem_ref(cm_id); - return ret; -} - -static int hvnd_accept_cr(struct iw_cm_id *cm_id, - struct iw_cm_conn_param *conn_param) -{ - int ret = 0; - struct hvnd_dev *nd_dev; - struct hvnd_qp *qp; - struct hvnd_ep_obj *connector; - enum ibv_qp_state new_qp_state; - - hvnd_debug("Accepting connection - PASSIVE\n"); - nd_dev = to_nd_dev(cm_id->device); - qp = get_qpp(to_nd_dev(cm_id->device), conn_param->qpn); - - if (!qp) { - hvnd_error("get_qpp failed conn_param->qpn=%d\n", - conn_param->qpn); - return -EINVAL; - } - - - connector = (struct hvnd_ep_obj *)cm_id->provider_data; - qp->connector = connector; - - if (connector == NULL) { - hvnd_error("NULL connector!\n"); - return -EINVAL; - } - hvnd_debug("connector's cm_id is %p caller cm_id=%p\n", - connector->cm_id, cm_id); - - connector->cq = qp->recv_cq; - - - /* - * Setup state for the accepted connection. 
- */ - cm_id->add_ref(cm_id); - connector->cm_id = cm_id; - if (conn_param == NULL) { - hvnd_error("NULL conn_param!\n"); - return -EINVAL; - } - - connector->ord = conn_param->ord; - connector->ird = conn_param->ird; - - if (!ep_add_work_pending(connector)) - goto error; - - init_completion(&connector->connector_accept_event); - - ret = hvnd_connector_accept(nd_dev, qp->uctx, connector->ep_handle, - qp->qp_handle, conn_param->ird, - conn_param->ord, - conn_param->private_data_len, - conn_param->private_data, - &new_qp_state, connector); - - if (ret) { - ep_del_work_pending(connector); - hvnd_error("connector accept failed\n"); - goto error; - } - - wait_for_completion(&connector->connector_accept_event); - ret = connector->connector_accept_status; - - if (ret) { - hvnd_error("connector_accept failed status=%x\n", ret); - ret = -EIO; - goto error; - } - - hvnd_debug("Passive Connection Accepted; new qp state is %d\n", - new_qp_state); - connector->cm_state = hvnd_cm_accept_sent; - return 0; - -error: - ep_stop(connector); - connector->cm_id = NULL; - connector->cm_state = hvnd_cm_idle; - - qp->connector = NULL; - cm_id->rem_ref(cm_id); - - return ret; -} - -static int hvnd_reject_cr(struct iw_cm_id *cm_id, const void *pdata, - u8 pdata_len) -{ - debug_check(__func__, __LINE__); - return 0; -} - -void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status) -{ - struct iw_cm_event cm_event; - - switch (status) { - case STATUS_SUCCESS: - case STATUS_CANCELLED: - break; - - default: - hvnd_warn("disconnect complete failed: status:%d\n", status); - } - - - hvnd_debug("active disconnect processed\n"); - memset(&cm_event, 0, sizeof(cm_event)); - - complete(&ep_object->disconnect_event); -} - - -void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status) -{ - struct iw_cm_event cm_event; - - // make sure we only disconnect once - if (atomic_xchg(&ep_object->disconnect_notified, 1)) - return; - - /* - * Turn off CQ monitoring. - */ - if (ep_object->cq) - ep_object->cq->monitor = false; - - switch (ep_object->cm_state) { - case hvnd_cm_connect_reply_sent: - case hvnd_cm_established_sent: - case hvnd_cm_accept_sent: - break; - - default: - hvnd_error("unexpected cm_state=%d\n", ep_object->cm_state); - return; - } - - switch (status) { - case STATUS_SUCCESS: - case STATUS_CANCELLED: - case STATUS_DISCONNECTED: - break; - - default: - hvnd_warn("notify disconnect complete failed: status:%d\n", - status); - } - - hvnd_debug("passive disconnect notified\n"); - memset(&cm_event, 0, sizeof(cm_event)); - - /* - * Other end disconnected. - * Connection has been disconnected; - * notify the cm layer. - */ - cm_event.status = -ECONNRESET; - cm_event.event = IW_CM_EVENT_CLOSE; - - if ((ep_object->cm_id) && - (ep_object->cm_id->event_handler)) { - - ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); - - ep_object->cm_id->rem_ref(ep_object->cm_id); - ep_object->cm_state = hvnd_cm_close_sent; - } -} - -void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status) -{ - struct iw_cm_event cm_event; - int ret; - - /* - * This is the problem area the return status may be: - * 1: 0xc00000b5 (3221225653) - {Device Timeout} The specified I/O - * operation on %hs was not completed before the time-out period - * expired - * 2: NTSTATUS 0xc0000241 (3221226049) - The transport connection was - * aborted by the local system. 
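 *
 * hvnd_process_notify_disconnect() above guards against double delivery with
 * an atomic exchange: exactly one caller observes the old value 0 and
 * notifies; everyone else bails. The idiom in isolation:
 */

#include <linux/atomic.h>

static atomic_t disconnect_notified = ATOMIC_INIT(0);

static void notify_disconnect_once(void)
{
        /* atomic_xchg() returns the previous value: only the first
         * caller sees 0, so the upcall runs exactly once even if the
         * host event and local teardown race each other. */
        if (atomic_xchg(&disconnect_notified, 1))
                return;

        /* ... deliver IW_CM_EVENT_CLOSE to the iwcm layer ... */
}

/*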
- * If we do nothing here, iwcm will wait for IW_CM_EVENT_ESTABLISHED - * forever, and unable to clean shutdown need to fail the call earlier - * on accept. - */ - - ep_object->connector_accept_status = status; - - if (status) { - hvnd_error("Connector accept failed; status is %x\n", status); - complete(&ep_object->connector_accept_event); - return; - } - - memset(&cm_event, 0, sizeof(cm_event)); - cm_event.event = IW_CM_EVENT_ESTABLISHED; - cm_event.ird = ep_object->ird; - cm_event.ord = ep_object->ord; - cm_event.provider_data = (void *)ep_object; - - /* - * We have successfully passively accepted the - * incoming connection. - */ - - hvnd_debug("Passive connection accepted!!\n"); - if ((ep_object->cm_id) && - (ep_object->cm_id->event_handler)) { - ep_object->cm_id->event_handler(ep_object->cm_id, &cm_event); - ep_object->cm_state = hvnd_cm_established_sent; - } - - complete(&ep_object->connector_accept_event); - - /* - * Request notification if the other end - * were to disconnect. - */ - if (!ep_add_work_pending(ep_object)) - return; - - ret = hvnd_connector_notify_disconnect(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - ep_object); - - if (ret) { - ep_del_work_pending(ep_object); - hvnd_error("Connector notify disconnect failed; ret: %d\n", - ret); - } -} - - -void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep_object, - int status) -{ - - struct ib_cq *ibcq; - struct hvnd_cq *cq; - - cq = ep_object->cq; - ibcq = &ep_object->cq->ibcq; - - if (!cq->monitor) - return; - - // call the previous CQ complete - if ((status == STATUS_PENDING) && cq->upcall_pending && - ibcq->comp_handler) { - ibcq->comp_handler(ibcq, ibcq->cq_context); - cq->upcall_pending = false; - hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); - } - - if ((status != STATUS_PENDING) && ibcq->comp_handler && - ibcq->cq_context) { - ibcq->comp_handler(ibcq, ibcq->cq_context); - hvnd_error("CQ comp_handler called status=%x\n", status); - } -} - -void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep_object, - int status) -{ - struct ib_cq *ibcq; - struct hvnd_cq *cq; - int ret; - - cq = ep_object->cq; - ibcq = &ep_object->cq->ibcq; - - // call hte previous CQ complete - if (cq->upcall_pending && ibcq->comp_handler) { - ibcq->comp_handler(ibcq, ibcq->cq_context); - cq->upcall_pending = false; - hvnd_debug("CQ comp_handler called arm_sn=%d\n", cq->arm_sn); - } - - cq->upcall_pending = true; - if (!ep_add_work_pending(ep_object)) - return; - - ret = hvnd_notify_cq(ep_object->nd_dev, - ep_object->cq, - ND_CQ_NOTIFY_ANY, - (u64)ep_object); - - if (ret) { - ep_del_work_pending(ep_object); -// hvnd_manage_io_state(ep_object, true); - hvnd_error("hvnd_notify_cq failed ret=%d\n", ret); - } - - if ((status != 0) && (status != STATUS_CANCELLED)) { - if (ibcq->event_handler) { - struct ib_event event; - - event.device = ibcq->device; - event.event = IB_EVENT_CQ_ERR; - event.element.cq = ibcq; - ibcq->event_handler(&event, ibcq->cq_context); - - hvnd_warn("CQ event_handler called status=%x\n", - status); - } - } -} - -int init_cm_event(struct hvnd_ep_obj *ep_object, struct iw_cm_event *cm_event, - int event) -{ - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event->local_addr; - struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event->remote_addr; - struct nd_read_limits rd_limits; - union nd_sockaddr_inet local_addr; - union nd_sockaddr_inet remote_addr; - int ret; - - /* - * Now get the local address. 
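 *
 * ND completion notifications are one-shot, so hvnd_process_cq_event_complete()
 * above re-arms from inside the event handler after flushing any pending
 * upcall. Skeleton only: request_cq_notify() stands in for hvnd_notify_cq()
 * and deliver_upcall() for the ibcq->comp_handler() call.
 */

struct cq_state {
        bool upcall_pending;    /* an event was armed but not yet flushed */
};

static int request_cq_notify(struct cq_state *cq);      /* one-shot arm */
static void deliver_upcall(struct cq_state *cq);

static void on_cq_event(struct cq_state *cq)
{
        if (cq->upcall_pending)
                deliver_upcall(cq);     /* flush the previously armed event */

        cq->upcall_pending = true;
        if (request_cq_notify(cq))      /* re-arm, or completions stall */
                cq->upcall_pending = false;
}

/*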
- */ - ret = hvnd_connector_get_local_addr(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - &local_addr); - - if (ret) { - hvnd_error("Connector get addr failed; ret: %d\n", ret); - return ret; - } - /* - * Now get the remote address. - */ - ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - &remote_addr); - - if (ret) { - hvnd_error("Connector get peer addr failed; ret: %d\n", ret); - return ret; - } - - /* - * Get other connection parameters. - */ - - ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - &rd_limits); - - if (ret) { - hvnd_error("Connector rd limits failed; ret: %d\n", ret); - return ret; - } - - /* - * XXXKYS: Luke: What about the length of the priv data? - */ - ret = hvnd_connector_get_priv_data(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - ep_object->priv_data); - - if (ret) { - hvnd_error("Connector get priv data failed; ret: %d\n", ret); - return ret; - } - /* - * Initialize CM structure. - */ - laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; - hvnd_debug("Local addr is %d\n", laddr->sin_addr.s_addr); - laddr->sin_port = local_addr.ipv4.sin_port; - laddr->sin_family = AF_INET; - - raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; - hvnd_debug("Remote addr is %d\n", raddr->sin_addr.s_addr); - raddr->sin_port = remote_addr.ipv4.sin_port; - raddr->sin_family = AF_INET; - - // KYS; LUke: is it always 148 bytes? - cm_event->private_data_len = MAX_PRIVATE_DATA_LEN; - cm_event->private_data = ep_object->priv_data; - - cm_event->ird = rd_limits.inbound; - cm_event->ord = rd_limits.outbound; - cm_event->event = event; - - ep_object->ird = cm_event->ird; - ep_object->ord = cm_event->ord; - - return 0; -} - -static void hvnd_refuse_connection(struct hvnd_ep_obj *connector, int status) -{ - struct iw_cm_event cm_event; - - memset(&cm_event, 0, sizeof(cm_event)); - - cm_event.event = IW_CM_EVENT_CONNECT_REPLY; - cm_event.status = status; - - hvnd_debug("returning status %d on connector %p\n", status, connector); - - if (connector->cm_id && connector->cm_id->event_handler) { - connector->cm_id->event_handler(connector->cm_id, &cm_event); - connector->cm_id->rem_ref(connector->cm_id); - connector->cm_state = hvnd_cm_connect_reply_refused; - } -} - -void hvnd_process_events(struct work_struct *work) -{ - struct hvnd_work *wrk; - struct nd_read_limits rd_limits; - struct hvnd_ep_obj *ep_object; - struct hvnd_ep_obj *parent; - struct iw_cm_event cm_event; - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_event.local_addr; - struct sockaddr_in *raddr = (struct sockaddr_in *)&cm_event.remote_addr; - struct ndv_packet_hdr_control_1 *ctrl_hdr; - union nd_sockaddr_inet local_addr; - union nd_sockaddr_inet remote_addr; - struct pkt_nd_get_connection_listener *connection_pkt; - struct iw_cm_id *cm_id = NULL; - int status; - int ioctl; - int ret; - char priv_data[MAX_PRIVATE_DATA_LEN]; - enum ibv_qp_state new_qp_state; - struct incoming_pkt *incoming_pkt; - unsigned long flags; - - memset(&cm_event, 0, sizeof(cm_event)); - memset(&priv_data, 0, MAX_PRIVATE_DATA_LEN); - - - wrk = container_of(work, struct hvnd_work, work); - - /* - * Now call into the connection manager. 
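 *
 * hvnd_process_events() here is a standard work-item drain loop: pop one
 * packet under the lock, process it with the lock dropped, loop until the
 * list is empty. The shape of it, with handle_pkt() standing in for the
 * per-ioctl dispatch switch:
 */

#include <linux/list.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/workqueue.h>

struct pkt {
        struct list_head entry;
};

struct pkt_queue {
        spinlock_t lock;
        struct list_head list;
        struct work_struct work;
};

static void handle_pkt(struct pkt *p);          /* per-ioctl dispatch */

static void drain_worker(struct work_struct *work)
{
        struct pkt_queue *q = container_of(work, struct pkt_queue, work);
        unsigned long flags;
        struct pkt *p;

        for (;;) {
                spin_lock_irqsave(&q->lock, flags);
                p = list_first_entry_or_null(&q->list, struct pkt, entry);
                if (p)
                        list_del(&p->entry);
                spin_unlock_irqrestore(&q->lock, flags);

                if (!p)
                        return;         /* queue drained */

                handle_pkt(p);          /* runs without the lock held */
                kfree(p);
        }
}

/*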
- */ - ep_object = (struct hvnd_ep_obj *)wrk->callback_arg; - parent = ep_object->parent; - -process_next: - incoming_pkt = NULL; - spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); - if (!list_empty(&ep_object->incoming_pkt_list)) { - incoming_pkt = list_first_entry(&ep_object->incoming_pkt_list, - struct incoming_pkt, - list_entry); - list_del(&incoming_pkt->list_entry); - } - spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, flags); - if (incoming_pkt == NULL) - return; - - ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt->pkt; - status = ctrl_hdr->io_status; - ioctl = ctrl_hdr->io_cntrl_code; - - hvnd_debug("Process Events IOCTL is: %s; iostatus failure: %x in work queue\n", - hvnd_get_op_name(ioctl), status); - - if (status != 0) { - bool log_error = true; - - if (ioctl == IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT && - status == STATUS_DISCONNECTED) // expected - log_error = false; - - if (log_error) - hvnd_warn("Process Events IOCTL is: %s; iostatus failure: %x\n", - hvnd_get_op_name(ioctl), status); - } - - cm_event.status = status; - - switch (ep_object->type) { - case ND_CONNECTOR: - switch (ioctl) { - case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: - - if (ep_object->parent != NULL) { - - // Do nothing with this connection request if - // listener is stopping - if (!ep_add_work_pending(ep_object->parent)) - break; - - cm_id = ep_object->parent->cm_id; //Listener - } - - connection_pkt = (struct pkt_nd_get_connection_listener *) ctrl_hdr; - - if ((status == 0) || (status == STATUS_CANCELLED)) { - hvnd_get_incoming_connections(ep_object->parent, - ep_object->parent->nd_dev, - ep_object->uctx); - } - - if (status) - goto get_connection_request_done; - - /* - * Now get the local address. - */ - ret = hvnd_connector_get_local_addr(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - &local_addr); - - if (ret) { - hvnd_error("Connector get addr failed; ret: %d\n", - ret); - goto get_connection_request_done; - } - /* - * Now get the remote address. - */ - ret = hvnd_connector_get_peer_addr(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - &remote_addr); - - if (ret) { - hvnd_error("Connector get peer addr failed; ret: %d\n", - ret); - goto get_connection_request_done; - } - /* - * Get other connection parameters. - */ - - ret = hvnd_connector_get_rd_limits(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - &rd_limits); - - if (ret) { - hvnd_error("Connector rd imits failed; ret: %d\n", - ret); - goto get_connection_request_done; - } - - /* - * XXXKYS: Luke: What about the length of the priv data? - */ - ret = hvnd_connector_get_priv_data( - ep_object->nd_dev, ep_object->uctx, - ep_object->ep_handle, ep_object->priv_data); - - if (ret) { - hvnd_error("Connector get priv data failed; ret: %d\n", - ret); - goto get_connection_request_done; - } - - cm_event.event = IW_CM_EVENT_CONNECT_REQUEST; - cm_event.provider_data = (void *)ep_object; - - laddr->sin_addr.s_addr = local_addr.ipv4.sin_addr.s_addr; - hvnd_debug("Local addr is %d\n", - laddr->sin_addr.s_addr); - laddr->sin_port = local_addr.ipv4.sin_port; - laddr->sin_family = AF_INET; - - raddr->sin_addr.s_addr = remote_addr.ipv4.sin_addr.s_addr; - hvnd_debug("Remote addr is %d\n", - raddr->sin_addr.s_addr); - raddr->sin_port = remote_addr.ipv4.sin_port; - raddr->sin_family = AF_INET; - - // KYS; LUke: is it always 148 bytes? 
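/*
 * The IOCTL_ND_CONNECTOR_CONNECT case below retries only on a host timeout,
 * and only a bounded number of times, before converting the status into an
 * errno for the CM layer. Reduced form: resend_connect() stands in for
 * re-sending the saved connect packet, and STATUS_TIMEOUT is assumed to come
 * from the driver's NTSTATUS definitions rather than being defined here.
 */

#include <linux/errno.h>
#include <linux/types.h>

#define CONNECT_MAX_RETRIES 3   /* matches the "< 3" check below */

struct active_conn {
        int retries;
};

static int resend_connect(struct active_conn *c);

/* Returns 0 on success or with a retry in flight; negative errno otherwise. */
static int connect_status_to_errno(struct active_conn *c, u32 status)
{
        if (status == STATUS_TIMEOUT && c->retries++ < CONNECT_MAX_RETRIES)
                return resend_connect(c);       /* result arrives async */

        if (status == STATUS_TIMEOUT)
                return -ETIMEDOUT;

        return status ? -ECONNREFUSED : 0;
}
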
- cm_event.private_data_len = MAX_PRIVATE_DATA_LEN; - cm_event.private_data = ep_object->priv_data; - - cm_event.ird = rd_limits.inbound; - cm_event.ord = rd_limits.outbound; - - ep_object->ird = cm_event.ird; - ep_object->ord = cm_event.ord; - - - if ((cm_id != NULL) && cm_id->event_handler) { - cm_id->event_handler(cm_id, &cm_event); - ep_object->cm_state = hvnd_cm_connect_request_sent; - } - -get_connection_request_done: - if (ep_object->parent != NULL) - ep_del_work_pending(ep_object->parent); - break; - - case IOCTL_ND_CONNECTOR_CONNECT: - - cm_event.event = IW_CM_EVENT_CONNECT_REPLY; - if (status == STATUS_TIMEOUT && - ep_object->connector_connect_retry < 3) { - //TIMEOUT retry - - if (!ep_add_work_pending(ep_object)) - goto refuse_connection; - - hvnd_warn("Connector connect timed out, reconnecting... retry count: %d\n", - ep_object->connector_connect_retry); - ep_object->connector_connect_retry++; - ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, - &ep_object->connector_connect_pkt.hdr, - sizeof(ep_object->connector_connect_pkt), - (u64)&ep_object->connector_connect_pkt); - - if (ret) { - hvnd_error("Connector on time out failed: %d\n", - ret); - ep_del_work_pending(ep_object); - goto refuse_connection; - } - break; - } - -refuse_connection: - if (status) { - cm_event.status = -ECONNREFUSED; - if (status == STATUS_TIMEOUT) - cm_event.status = -ETIMEDOUT; - - hvnd_refuse_connection(ep_object, - cm_event.status); - break; - } - - hvnd_debug("ACTIVE Connection ACCEPTED\n"); - ret = init_cm_event(ep_object, &cm_event, - IW_CM_EVENT_CONNECT_REPLY); - if (ret) { - hvnd_error("init_cm_event failed ret=%d\n", - ret); - goto process_done; - } - - ret = hvnd_connector_complete_connect(ep_object->nd_dev, - ep_object->uctx, - ep_object->ep_handle, - &new_qp_state); - if (ret) { - hvnd_error("connector_complete failed\n"); - goto process_done; - } - - cm_event.provider_data = (void *)ep_object; - - if ((ep_object->cm_id) && - (ep_object->cm_id->event_handler)) { - ep_object->cm_id->event_handler( - ep_object->cm_id, &cm_event); - ep_object->cm_state = hvnd_cm_connect_reply_sent; - } - /* - * Rquest notification if the other end - * were to disconnect. 
- */ - if (!ep_add_work_pending(ep_object)) - goto process_done; - - ret = hvnd_connector_notify_disconnect( - ep_object->nd_dev, ep_object->uctx, - ep_object->ep_handle, ep_object); - - if (ret) { - ep_del_work_pending(ep_object); - hvnd_error("Connector notify disconnect failed; ret: %d\n", - ret); - } - - break; - - case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: - hvnd_process_notify_disconnect(ep_object, status); - break; - - - default: - hvnd_error("Unknown Connector IOCTL\n"); - break; - } - break; - default: - hvnd_error("Unknown endpoint object\n"); - break; - } -process_done: - kfree(incoming_pkt); - ep_del_work_pending(ep_object); - - goto process_next; -} - - -static struct hvnd_ep_obj *hvnd_setup_ep(struct iw_cm_id *cm_id, int ep_type, - struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx) -{ - struct hvnd_ep_obj *ep_object; - int ret; - - ep_object = kzalloc(sizeof(struct hvnd_ep_obj), GFP_KERNEL); - - if (!ep_object) - return NULL; - - ret = hvnd_init_ep(ep_object, cm_id, ep_type, nd_dev, uctx); - - if (ret) { - hvnd_error("hvnd_init_ep failed ret=%d\n", ret); - kfree(ep_object); - return NULL; - } - - return ep_object; -} - -/* - * return value: - * true: I/O state is stopped, we should not do upcall - * flase: I/O state is running and normal - */ -#if 0 -static bool hvnd_manage_io_state(struct hvnd_ep_obj *ep, bool failure) -{ - unsigned long flags; - - spin_lock_irqsave(&ep->ep_lk, flags); - if (ep->to_be_destroyed) { - hvnd_warn("ep being destroyed\n"); - if (ep->io_outstanding) { - hvnd_warn("ep being destroyed i/O pending waking up on %p\n", - &ep->block_event); - complete(&ep->block_event); - ep->io_outstanding = false; - } - spin_unlock_irqrestore(&ep->ep_lk, flags); - return true; - } - if (!failure) - ep->io_outstanding = true; - spin_unlock_irqrestore(&ep->ep_lk, flags); - return false; -} -#endif - -static int hvnd_get_incoming_connections(struct hvnd_ep_obj *listener, - struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx) -{ - struct hvnd_ep_obj *connector; - u64 connector_handle; - int ret; - - /* - * First handle the protocol for - * destruction - outstanding I/O. - */ - -// if (hvnd_manage_io_state(listener, false)) -// return 0; - /* - * Create a connector. - */ - connector = hvnd_setup_ep(listener->cm_id, ND_CONNECTOR, nd_dev, uctx); - if (!connector) { - hvnd_error("hvnd_setup_ep failed\n"); - ret = -ENOMEM; - goto con_alloc_err; - } - - ret = hvnd_cr_connector(nd_dev, uctx, - &connector_handle); - if (ret) { - hvnd_error("hvnd_cr_connector failed ret=%d\n", ret); - goto con_cr_err; - } - - /* - * Now get a connection if one is pending. 
- */ - connector->ep_handle = connector_handle; - connector->parent = listener; - - if (!ep_add_work_pending(connector)) - goto get_connection_err; - - ret = hvnd_get_connection_listener(nd_dev, uctx, - listener->ep_handle, - connector_handle, - (u64)connector); - - if (ret) { - hvnd_debug("listener_get_connection failed\n"); - ep_del_work_pending(connector); - goto get_connection_err; - } - - hvnd_acquire_uctx_ref(uctx); - listener->outstanding_handle = connector_handle; - listener->outstanding_ep = connector; - hvnd_debug("outstanding handle is %p\n", (void *)connector_handle); - return 0; - -get_connection_err: - hvnd_free_handle(nd_dev, uctx, - connector_handle, - IOCTL_ND_CONNECTOR_FREE); - -con_cr_err: - kfree(connector); -con_alloc_err: -// hvnd_manage_io_state(listener, true); - return ret; -} - -static int hvnd_create_listen(struct iw_cm_id *cm_id, int backlog) -{ - int ret = 0; - struct hvnd_dev *nd_dev; - struct hvnd_ucontext *uctx; - struct hvnd_ep_obj *ep_object; - union nd_sockaddr_inet addr; - union nd_sockaddr_inet local_addr; - u64 listener_handle; - struct sockaddr_in *laddr = (struct sockaddr_in *)&cm_id->local_addr; - union nd_sockaddr_inet og_addr; - - - nd_dev = to_nd_dev(cm_id->device); - uctx = get_uctx(nd_dev, current_pid()); - hvnd_debug("uctx is %p; pid is %d\n", uctx, current_pid()); - - if (cm_id->local_addr.ss_family != AF_INET) { - hvnd_error("cm_id->local_addr.ss_family =%d not AF_INET\n", - cm_id->local_addr.ss_family); - return -ENOSYS; - } - - /* - * If the local address is LOOPBACK or INADDR_ANY, get an an address - * to bind the listener. For now, just get the first address - * available. - */ - - if (IN_LOOPBACK(ntohl(laddr->sin_addr.s_addr)) || - (laddr->sin_addr.s_addr == INADDR_ANY)) { - - hvnd_debug("need to get an address\n"); - ret = hvnd_get_outgoing_rdma_addr(nd_dev, uctx, &og_addr); - - if (ret) { - hvnd_error("failed to get the og address\n"); - return ret; - } - - laddr->sin_addr.s_addr = og_addr.ipv4.sin_addr.s_addr; - } - - cm_id->add_ref(cm_id); - - ep_object = hvnd_setup_ep(cm_id, ND_LISTENER, nd_dev, uctx); - - if (!ep_object) { - hvnd_error("hvnd_setup_ep returned NULL\n"); - goto alloc_err; - } - - ret = hvnd_cr_listener(nd_dev, uctx, - &listener_handle); - if (ret) { - hvnd_error("hvnd_cr_listener failed ret=%d\n", ret); - goto cr_err; - } - - ep_object->ep_handle = listener_handle; - - cm_id->provider_data = ep_object; - - /* - * Now bind the listener. - * IPV4 support only. - */ - memcpy(&addr.ipv4, laddr, sizeof(struct sockaddr_in)); - - ret = hvnd_bind_listener(nd_dev, uctx, - listener_handle, - &addr); - if (ret) { - hvnd_error("hvnd_bind_listener failed ret=%d\n", ret); - goto bind_err; - } - - /* - * Now get the local address. - */ - ret = hvnd_get_addr_listener(nd_dev, uctx, - listener_handle, - &local_addr); - if (ret) { - hvnd_error("hvnd_get_addr_listener failed ret=%d\n", ret); - goto bind_err; - } - - /* - * Now put the listener in the listen mode. - */ - - ret = hvnd_listen_listener(nd_dev, uctx, - listener_handle, - backlog); - - if (ret) { - hvnd_error("hvnd_listen_listener failed ret=%d\n", ret); - goto bind_err; - } - - - /* - * Now get a pending connection if one is pending. 
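 *
 * The listener cannot be bound host-side to a wildcard or loopback address,
 * so hvnd_create_listen() above substitutes the first routable RDMA address.
 * That check in isolation; pick_rdma_addr() stands in for
 * hvnd_get_outgoing_rdma_addr(), and ipv4_is_loopback() is the modern
 * spelling of the IN_LOOPBACK test used above.
 */

#include <linux/in.h>

static int pick_rdma_addr(struct in_addr *out);

static int fixup_listen_addr(struct sockaddr_in *sin)
{
        if (ipv4_is_loopback(sin->sin_addr.s_addr) ||
            sin->sin_addr.s_addr == htonl(INADDR_ANY))
                return pick_rdma_addr(&sin->sin_addr);

        return 0;       /* address is already concrete; bind as-is */
}

/*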
- */ - ret = hvnd_get_incoming_connections(ep_object, nd_dev, uctx); - if (ret) { - hvnd_error("hvnd_get_incoming_connections failed ret=%d\n", - ret); - goto bind_err; - } - - hvnd_acquire_uctx_ref(uctx); - hvnd_debug("cm_id=%p\n", cm_id); - return 0; - -bind_err: - hvnd_free_handle(nd_dev, uctx, - listener_handle, - IOCTL_ND_LISTENER_FREE); -cr_err: - kfree(ep_object); -alloc_err: - cm_id->provider_data = NULL; - cm_id->rem_ref(cm_id); - return ret; -} - -static int hvnd_destroy_listen(struct iw_cm_id *cm_id) -{ - struct hvnd_dev *nd_dev; - struct hvnd_ucontext *uctx; - struct hvnd_ep_obj *ep_object; - - nd_dev = to_nd_dev(cm_id->device); - - ep_object = (struct hvnd_ep_obj *)cm_id->provider_data; - - hvnd_debug("uctx is %p\n", ep_object->uctx); - hvnd_debug("Destroying Listener cm_id=%p\n", cm_id); - uctx = ep_object->uctx; - - // make sure there is nothing in progress on this ep - ep_stop(ep_object); - - hvnd_free_handle(nd_dev, uctx, - ep_object->ep_handle, - IOCTL_ND_LISTENER_FREE); - - /* - * We may have an ouststanding connector for - * incoming connection requests; clean it up. - */ - - if (ep_object->outstanding_handle != 0) { - - // make sure there is nothing in progress on this ep - ep_stop(ep_object->outstanding_ep); - - hvnd_free_handle(nd_dev, uctx, - ep_object->outstanding_handle, - IOCTL_ND_CONNECTOR_FREE); - - - hvnd_drop_uctx_ref(nd_dev, uctx); - hvnd_destroy_ep(ep_object->outstanding_ep); - } - - /* - * Now everything should have stopped - */ - - cm_id->rem_ref(cm_id); - hvnd_destroy_ep(ep_object); - cm_id->provider_data = NULL; - hvnd_drop_uctx_ref(nd_dev, uctx); - - hvnd_debug("cm_id=%p\n", cm_id); - return 0; -} - -static void hvnd_qp_add_ref(struct ib_qp *ibqp) -{ - struct hvnd_qp *qp = to_nd_qp(ibqp); - - atomic_inc(&qp->refcnt); -} - -void hvnd_qp_rem_ref(struct ib_qp *ibqp) -{ - struct hvnd_qp *qp = to_nd_qp(ibqp); - - if (atomic_dec_and_test(&qp->refcnt)) - wake_up(&qp->wait); -} - -static DEVICE_ATTR(hw_rev, 0444, hvnd_show_rev, NULL); -static DEVICE_ATTR(fw_ver, 0444, hvnd_show_fw_ver, NULL); -static DEVICE_ATTR(hca_type, 0444, hvnd_show_hca, NULL); -static DEVICE_ATTR(board_id, 0444, hvnd_show_board, NULL); - -static struct device_attribute *hvnd_class_attributes[] = { - &dev_attr_hw_rev, - &dev_attr_fw_ver, - &dev_attr_hca_type, - &dev_attr_board_id, -}; - -int hvnd_register_device(struct hvnd_dev *dev, char *ip_addr, char *mac_addr) -{ - int ret; - - dev->ibdev.owner = THIS_MODULE; - dev->device_cap_flags = IB_DEVICE_LOCAL_DMA_LKEY | IB_DEVICE_MEM_WINDOW; - dev->ibdev.local_dma_lkey = 0; - dev->ibdev.uverbs_cmd_mask = - (1ull << IB_USER_VERBS_CMD_GET_CONTEXT) | - (1ull << IB_USER_VERBS_CMD_QUERY_DEVICE) | - (1ull << IB_USER_VERBS_CMD_QUERY_PORT) | - (1ull << IB_USER_VERBS_CMD_ALLOC_PD) | - (1ull << IB_USER_VERBS_CMD_DEALLOC_PD) | - (1ull << IB_USER_VERBS_CMD_REG_MR) | - (1ull << IB_USER_VERBS_CMD_DEREG_MR) | - (1ull << IB_USER_VERBS_CMD_CREATE_COMP_CHANNEL) | - (1ull << IB_USER_VERBS_CMD_CREATE_CQ) | - (1ull << IB_USER_VERBS_CMD_DESTROY_CQ) | - (1ull << IB_USER_VERBS_CMD_REQ_NOTIFY_CQ) | - (1ull << IB_USER_VERBS_CMD_CREATE_QP) | - (1ull << IB_USER_VERBS_CMD_MODIFY_QP) | - (1ull << IB_USER_VERBS_CMD_QUERY_QP) | - (1ull << IB_USER_VERBS_CMD_POLL_CQ) | - (1ull << IB_USER_VERBS_CMD_DESTROY_QP) | - (1ull << IB_USER_VERBS_CMD_POST_SEND) | - (1ull << IB_USER_VERBS_CMD_POST_RECV); - dev->ibdev.node_type = RDMA_NODE_RNIC; - memcpy(dev->ibdev.node_desc, HVND_NODE_DESC, sizeof(HVND_NODE_DESC)); - memcpy(&dev->ibdev.node_guid, mac_addr, 6); - dev->ibdev.phys_port_cnt = 
1; //dev->nports; - dev->ibdev.num_comp_vectors = 1; - dev->ibdev.query_device = hvnd_query_device; - dev->ibdev.query_port = hvnd_query_port; - dev->ibdev.get_link_layer = hvnd_get_link_layer; - dev->ibdev.query_pkey = hvnd_query_pkey; - dev->ibdev.query_gid = hvnd_query_gid; - dev->ibdev.alloc_ucontext = hvnd_alloc_ucontext; - dev->ibdev.dealloc_ucontext = hvnd_dealloc_ucontext; - dev->ibdev.mmap = hvnd_mmap; - dev->ibdev.alloc_pd = hvnd_allocate_pd; - dev->ibdev.dealloc_pd = hvnd_deallocate_pd; - dev->ibdev.create_ah = hvnd_ah_create; - dev->ibdev.destroy_ah = hvnd_ah_destroy; - dev->ibdev.create_qp = hvnd_ib_create_qp; - dev->ibdev.modify_qp = hvnd_ib_modify_qp; - dev->ibdev.query_qp = hvnd_ib_query_qp; - dev->ibdev.destroy_qp = hvnd_destroy_qp; - dev->ibdev.create_cq = hvnd_ib_create_cq; - dev->ibdev.destroy_cq = hvnd_ib_destroy_cq; - dev->ibdev.resize_cq = hvnd_resize_cq; - dev->ibdev.poll_cq = hvnd_poll_cq; - dev->ibdev.get_dma_mr = hvnd_get_dma_mr; - dev->ibdev.reg_user_mr = hvnd_reg_user_mr; - dev->ibdev.dereg_mr = hvnd_dereg_mr; - dev->ibdev.alloc_mw = hvnd_alloc_mw; - dev->ibdev.dealloc_mw = hvnd_dealloc_mw; - dev->ibdev.attach_mcast = hvnd_multicast_attach; - dev->ibdev.detach_mcast = hvnd_multicast_detach; - dev->ibdev.process_mad = hvnd_process_mad; - dev->ibdev.req_notify_cq = hvnd_arm_cq; - dev->ibdev.post_send = hvnd_post_send; - dev->ibdev.post_recv = hvnd_post_receive; - dev->ibdev.uverbs_abi_ver = MLX4_IB_UVERBS_ABI_VERSION; - - dev->ibdev.get_port_immutable = hvnd_get_port_immutable; - - //DMA ops for mapping all possible addresses - dev->ibdev.dev.parent = &(dev->hvdev->device); - dev->ibdev.dev.dma_ops = &vmbus_dma_ops; - dev->ibdev.dev.dma_mask = (u64 *) DMA_BIT_MASK(64); - dev->ibdev.dev.coherent_dma_mask = DMA_BIT_MASK(64); - - dev->ibdev.iwcm = kmalloc(sizeof(struct iw_cm_verbs), GFP_KERNEL); - if (!dev->ibdev.iwcm) - return -ENOMEM; - - dev->ibdev.iwcm->connect = hvnd_connect; - dev->ibdev.iwcm->accept = hvnd_accept_cr; - dev->ibdev.iwcm->reject = hvnd_reject_cr; - dev->ibdev.iwcm->create_listen = hvnd_create_listen; - dev->ibdev.iwcm->destroy_listen = hvnd_destroy_listen; - dev->ibdev.iwcm->add_ref = hvnd_qp_add_ref; - dev->ibdev.iwcm->rem_ref = hvnd_qp_rem_ref; - dev->ibdev.iwcm->get_qp = hvnd_get_qp; - - strlcpy(dev->ibdev.name, "mlx4_%d", IB_DEVICE_NAME_MAX); - ret = ib_register_device(&dev->ibdev, NULL); - if (ret) { - hvnd_error("ib_register_device failed ret=%d\n", ret); - goto bail1; - } - -#if 0 - for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) { - ret = device_create_file(&dev->ibdev.dev, - hvnd_class_attributes[i]); - if (ret) { - hvnd_error("device_create_file failed ret=%d\n", ret); - goto bail2; - } - } -#endif - - dev->ib_active = true; - return 0; -//bail2: -// ib_unregister_device(&dev->ibdev); -bail1: - kfree(dev->ibdev.iwcm); - return ret; -} - -void hvnd_unregister_device(struct hvnd_dev *dev) -{ - int i; - - for (i = 0; i < ARRAY_SIZE(hvnd_class_attributes); ++i) - device_remove_file(&dev->ibdev.dev, - hvnd_class_attributes[i]); - ib_unregister_device(&dev->ibdev); - kfree(dev->ibdev.iwcm); - ib_dealloc_device((struct ib_device *)dev); -} - -static int hvnd_try_bind_nic(unsigned char *mac, __be32 ip) -{ - int ret; - struct hvnd_dev *nd_dev = g_nd_dev; - - mutex_lock(&nd_dev->bind_mutex); - if (nd_dev->bind_complete) { - mutex_unlock(&nd_dev->bind_mutex); - return 1; - } - - memcpy(nd_dev->mac_addr, mac, 6); - *(__be32 *)(nd_dev->ip_addr) = ip; - - /* - * Bind the NIC. 
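 *
 * Every IPv4 address event funnels into hvnd_try_bind_nic() above; the mutex
 * plus bind_complete flag make the first NIC the host accepts win and turn
 * every later attempt into a no-op. The guard on its own, with try_bind()
 * standing in for the hvnd_bind_nic() + hvnd_register_device() sequence:
 */

#include <linux/mutex.h>
#include <linux/types.h>

static DEFINE_MUTEX(bind_mutex);
static bool bind_complete;

static int try_bind(const unsigned char *mac, __be32 ip);

static int bind_once(const unsigned char *mac, __be32 ip)
{
        int ret = 0;

        mutex_lock(&bind_mutex);
        if (!bind_complete) {
                ret = try_bind(mac, ip);
                if (!ret)
                        bind_complete = true;   /* first success wins */
        }
        mutex_unlock(&bind_mutex);
        return ret;
}

/*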
- */ - hvnd_info("trying to bind to IP %pI4 MAC %pM\n", - nd_dev->ip_addr, nd_dev->mac_addr); - ret = hvnd_bind_nic(nd_dev, false, nd_dev->ip_addr, nd_dev->mac_addr); - if (ret || nd_dev->bind_pkt.pkt_hdr.status) { - mutex_unlock(&nd_dev->bind_mutex); - return 1; - } - - /* if we reach here, this means bind_nic is a success */ - hvnd_error("successfully bound to IP %pI4 MAC %pM\n", - nd_dev->ip_addr, nd_dev->mac_addr); - complete(&nd_dev->addr_set); - nd_dev->bind_complete = 1; - mutex_unlock(&nd_dev->bind_mutex); - - ret = hvnd_register_device(nd_dev, nd_dev->ip_addr, nd_dev->mac_addr); - - if (!ret) - return 0; - - hvnd_error("hvnd_register_device failed ret=%d\n", ret); - - /* roll back all allocated resources on error */ - iounmap(nd_dev->mmio_virt); - release_resource(&nd_dev->mmio_resource); - - vmbus_close(nd_dev->hvdev->channel); - ib_dealloc_device((struct ib_device *)nd_dev); - - return 1; -} - -static void hvnd_inetaddr_event_up(unsigned long event, struct in_ifaddr *ifa) -{ - hvnd_try_bind_nic(ifa->ifa_dev->dev->dev_addr, ifa->ifa_address); -} - -static int hvnd_inetaddr_event(struct notifier_block *notifier, - unsigned long event, void *ptr) -{ - struct in_ifaddr *ifa = ptr; - - switch (event) { - case NETDEV_UP: - hvnd_inetaddr_event_up(event, ifa); - break; - default: - hvnd_debug("Received inetaddr event %lu\n", event); - } - - return NOTIFY_DONE; -} - -static struct notifier_block hvnd_inetaddr_notifier = { - .notifier_call = hvnd_inetaddr_event, -}; - -static int start_bind_nic(void) -{ - struct net_device *dev; - struct in_device *idev; - struct in_ifaddr *ifa; - - register_inetaddr_notifier(&hvnd_inetaddr_notifier); - - rtnl_lock(); - for_each_netdev(&init_net, dev) { - idev = in_dev_get(dev); - if (!idev) - continue; - for (ifa = (idev)->ifa_list; - ifa && !(ifa->ifa_flags&IFA_F_SECONDARY); - ifa = ifa->ifa_next) { - hvnd_try_bind_nic(dev->dev_addr, ifa->ifa_address); - } - } - rtnl_unlock(); - - return 0; -} - -static int hvnd_probe(struct hv_device *dev, - const struct hv_vmbus_device_id *dev_id) -{ - struct hvnd_dev *nd_dev; - int ret = 0; - - hvnd_debug("hvnd starting\n"); - - nd_dev = (struct hvnd_dev *)ib_alloc_device(sizeof(struct hvnd_dev)); - if (!nd_dev) { - ret = -ENOMEM; - goto err_out0; - } - - nd_dev->hvdev = dev; - /* - * We are going to masquerade as MLX4 device; - * Set the vendor and device ID accordingly. - */ - dev->vendor_id = 0x15b3; //Mellanox - dev->device_id = 0x1003; //Mellanox HCA - INIT_LIST_HEAD(&nd_dev->listentry); - spin_lock_init(&nd_dev->uctxt_lk); - nd_dev->ib_active = false; - - /* - * Initialize the state for the id table. - */ - spin_lock_init(&nd_dev->id_lock); - idr_init(&nd_dev->cqidr); - idr_init(&nd_dev->qpidr); - idr_init(&nd_dev->mmidr); - idr_init(&nd_dev->irpidr); - idr_init(&nd_dev->uctxidr); - - atomic_set(&nd_dev->open_cnt, 0); - - sema_init(&nd_dev->query_pkt_sem, 1); - - ret = vmbus_open(dev->channel, HVND_RING_SZ, HVND_RING_SZ, NULL, 0, - hvnd_callback, dev); - - if (ret) { - hvnd_error("vmbus_open failed ret=%d\n", ret); - goto err_out1; - } - - hv_set_drvdata(dev, nd_dev); - - ret = hvnd_negotiate_version(nd_dev); - - if (ret) { - hvnd_error("hvnd_negotiate_version failed ret=%d\n", ret); - goto err_out2; - } - - /* - * Register resources with the host. 
- */ - ret = hvnd_init_resources(nd_dev); - if (ret) { - hvnd_error("hvnd_init_resources failed ret=%d\n", ret); - goto err_out2; - } - - /* - * Try to bind every NIC to ND channel, - * ND host will only return success for the correct one - */ - nd_dev->bind_complete = 0; - mutex_init(&nd_dev->bind_mutex); - init_completion(&nd_dev->addr_set); - - g_nd_dev = nd_dev; - start_bind_nic(); - - return 0; - -err_out2: - vmbus_close(dev->channel); - -err_out1: - ib_dealloc_device((struct ib_device *)nd_dev); - -err_out0: - return ret; -} - -static int hvnd_remove(struct hv_device *dev) -{ - struct hvnd_dev *nd_dev = hv_get_drvdata(dev); - - unregister_inetaddr_notifier(&hvnd_inetaddr_notifier); - hvnd_bind_nic(nd_dev, true, nd_dev->ip_addr, nd_dev->mac_addr); - hvnd_unregister_device(nd_dev); - vmbus_close(dev->channel); - iounmap(nd_dev->mmio_virt); - release_resource(&nd_dev->mmio_resource); - return 0; -} - -static const struct hv_vmbus_device_id id_table[] = { - /* VMBUS RDMA class guid */ - /* 8c2eaf3d-32a7-4b09-ab99-bd1f1c86b501 */ - { HV_ND_GUID, }, - { }, -}; - -MODULE_DEVICE_TABLE(vmbus, id_table); - -static struct hv_driver hvnd_drv = { - .name = "hv_guest_rdma", - .id_table = id_table, - .probe = hvnd_probe, - .remove = hvnd_remove, -}; - - -static int __init init_hvnd_drv(void) -{ - - pr_info("Registered HyperV networkDirect Driver\n"); - hvnd_addr_init(); - return vmbus_driver_register(&hvnd_drv); -} - -static void exit_hvnd_drv(void) -{ - pr_info("De-Registered HyperV networkDirect Driver\n"); - hvnd_addr_deinit(); - vmbus_driver_unregister(&hvnd_drv); -} - - -module_init(init_hvnd_drv); -module_exit(exit_hvnd_drv); - -MODULE_DESCRIPTION("Hyper-V NetworkDirect Driver"); -MODULE_LICENSE("GPL"); -MODULE_VERSION("4.1.3"); diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/user.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/user.h deleted file mode 100644 index 07e6769ef43b..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/user.h +++ /dev/null @@ -1,107 +0,0 @@ -/* - * Copyright (c) 2007 Cisco Systems, Inc. All rights reserved. - * Copyright (c) 2007, 2008 Mellanox Technologies. All rights reserved. - * - * This software is available to you under a choice of one of two - * licenses. You may choose to be licensed under the terms of the GNU - * General Public License (GPL) Version 2, available from the file - * COPYING in the main directory of this source tree, or the - * OpenIB.org BSD license below: - * - * Redistribution and use in source and binary forms, with or - * without modification, are permitted provided that the following - * conditions are met: - * - * - Redistributions of source code must retain the above - * copyright notice, this list of conditions and the following - * disclaimer. - * - * - Redistributions in binary form must reproduce the above - * copyright notice, this list of conditions and the following - * disclaimer in the documentation and/or other materials - * provided with the distribution. - * - * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, - * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND - * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS - * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN - * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN - * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - * SOFTWARE. 
- */ - -#ifndef MLX4_IB_USER_H -#define MLX4_IB_USER_H - -#include <linux/types.h> - -/* - * Increment this value if any changes that break userspace ABI - * compatibility are made. - */ - -#define MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION 3 -#define MLX4_IB_UVERBS_ABI_VERSION 4 - -/* - * Make sure that all structs defined in this file remain laid out so - * that they pack the same way on 32-bit and 64-bit architectures (to - * avoid incompatibility between 32-bit userspace and 64-bit kernels). - * In particular do not use pointer types -- pass pointers in __u64 - * instead. - */ - -struct mlx4_ib_alloc_ucontext_resp_v3 { - __u32 qp_tab_size; - __u16 bf_reg_size; - __u16 bf_regs_per_page; -}; - -struct mlx4_ib_alloc_ucontext_resp { - __u32 dev_caps; - __u32 qp_tab_size; - __u16 bf_reg_size; - __u16 bf_regs_per_page; - __u32 cqe_size; -}; - -struct mlx4_ib_alloc_pd_resp { - __u32 pdn; - __u32 reserved; -}; - -struct mlx4_ib_create_cq { - __u64 buf_addr; - __u64 db_addr; -}; - -struct mlx4_ib_create_cq_resp { - __u32 cqn; - __u32 reserved; -}; - -struct mlx4_ib_resize_cq { - __u64 buf_addr; -}; - -struct mlx4_ib_create_srq { - __u64 buf_addr; - __u64 db_addr; -}; - -struct mlx4_ib_create_srq_resp { - __u32 srqn; - __u32 reserved; -}; - -struct mlx4_ib_create_qp { - __u64 buf_addr; - __u64 db_addr; - __u8 log_sq_bb_count; - __u8 log_sq_stride; - __u8 sq_no_prefetch; - __u8 reserved[5]; -}; - -#endif /* MLX4_IB_USER_H */ diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.c b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.c deleted file mode 100644 index daf68dcccb1b..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.c +++ /dev/null @@ -1,3074 +0,0 @@ -/* - * Copyright (c) 2014, Microsoft Corporation. - * - * Author: - * K. Y. Srinivasan <kys@microsoft.com> - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License version 2 as published - * by the Free Software Foundation. - * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or - * NON INFRINGEMENT. See the GNU General Public License for more - * details. - * - * Bug fixes/enhancements: Long Li <longli@microsoft.com> - */ - -#include <linux/completion.h> -#include <linux/module.h> -#include <linux/errno.h> -#include <linux/hyperv.h> -#include <linux/efi.h> -#include <linux/slab.h> -#include <linux/cred.h> -#include <linux/sched.h> -#include <linux/types.h> -#include <linux/scatterlist.h> -#include <rdma/ib_umem.h> -#include <rdma/ib_user_verbs.h> -#include <asm-generic/delay.h> - -#include "vmbus_rdma.h" - -/* - * We only have a single rdma device on the host; - * have a single receive buffer. 
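*
* A note on the user.h block above: those structs are shared with
* userspace, so they must lay out identically on 32-bit and 64-bit
* builds; no pointers (addresses travel as __u64) and explicit
* padding so the compiler never adds its own. A new response field
* would follow the same pattern, e.g. (hypothetical field, shown only
* to illustrate the layout rule):
*
*	struct mlx4_ib_alloc_pd_resp {
*		__u32	pdn;
*		__u32	reserved;	// existing explicit pad
*		__u64	extra;		// new: 8-byte aligned on both ABIs
*	};
*
* together with a bump of MLX4_IB_UVERBS_ABI_VERSION, per the comment
* in that header.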
- */ - - -static char hvnd_recv_buffer[PAGE_SIZE * 4]; - -static atomic_t irp_local_hdl; - -char *hvnd_get_op_name(int ioctl) -{ - switch (ioctl) { - case IOCTL_ND_PROVIDER_INIT: - return "IOCTL_ND_PROVIDER_INIT\n"; - case IOCTL_ND_PROVIDER_BIND_FILE: - return "IOCTL_ND_PROVIDER_BIND_FILE\n"; - case IOCTL_ND_ADAPTER_OPEN: - return "IOCTL_ND_ADAPTER_OPEN\n"; - - case IOCTL_ND_ADAPTER_CLOSE: - return "IOCTL_ND_ADAPTER_CLOSE\n"; - - case IOCTL_ND_ADAPTER_QUERY: - return "IOCTL_ND_ADAPTER_QUERY\n"; - - case IOCTL_ND_PD_CREATE: - return "IOCTL_ND_PD_CREATE\n"; - - case IOCTL_ND_PD_FREE: - return "IOCTL_ND_PD_FREE\n"; - - case IOCTL_ND_CQ_CREATE: - return "IOCTL_ND_CQ_CREATE\n"; - - case IOCTL_ND_CQ_FREE: - return "IOCTL_ND_CQ_FREE\n"; - case IOCTL_ND_CQ_CANCEL_IO: - return "IOCTL_ND_CQ_CANCEL_IO\n"; - case IOCTL_ND_CQ_GET_AFFINITY: - return "IOCTL_ND_CQ_GET_AAFINITY\n"; - case IOCTL_ND_CQ_MODIFY: - return "IOCTL_ND_CQ_MODIFY\n"; - - case IOCTL_ND_CQ_NOTIFY: - return "IOCTL_ND_CQ_NOTIFY\n"; - - - case IOCTL_ND_LISTENER_CREATE: - return "IOCTL_ND_LISTENER_CREATE\n"; - - case IOCTL_ND_LISTENER_FREE: - return "IOCTL_ND_LISTENER_FREE\n"; - - case IOCTL_ND_QP_FREE: - return "IOCTL_ND_QP_FREE\n"; - - case IOCTL_ND_CONNECTOR_CANCEL_IO: - return "IOCTL_ND_CONNECTOR_CANCEL_IO\n"; - - case IOCTL_ND_LISTENER_CANCEL_IO: - return "IOCTL_ND_LISTENER_CANCEL_IO\n"; - - case IOCTL_ND_LISTENER_BIND: - return "IOCTL_ND_LISTENER_BIND\n"; - - case IOCTL_ND_LISTENER_LISTEN: - return "IOCTL_ND_LISTENER_LISTEN\n"; - - case IOCTL_ND_LISTENER_GET_ADDRESS: - return "IOCTL_ND_LISTENER_GET_ADDRESS\n"; - - case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: - return "IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST\n"; - - - - case IOCTL_ND_CONNECTOR_CREATE: - return "IOCTL_ND_CONNECTOR_CREATE\n"; - - case IOCTL_ND_CONNECTOR_FREE: - return "IOCTL_ND_CONNECTOR_FREE\n"; - - case IOCTL_ND_CONNECTOR_BIND: - return "IOCTL_ND_CONNECTOR_BIND\n"; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_CONNECT: - return "IOCTL_ND_CONNECTOR_CONNECT\n"; - - case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: - return "IOCTL_ND_CONNECTOR_COMPLETE_CONNECT\n"; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_ACCEPT: - return "IOCTL_ND_CONNECTOR_ACCEPT\n"; - - case IOCTL_ND_CONNECTOR_REJECT: - return "IOCTL_ND_CONNECTOR_REJECT\n"; - - case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: - return "IOCTL_ND_CONNECTOR_GET_READ_LIMITS\n"; - - case IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: - return "IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA\n"; - - case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: - return "IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS\n"; - - case IOCTL_ND_CONNECTOR_GET_ADDRESS: - return "IOCTL_ND_CONNECTOR_GET_ADDRESS\n"; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: - return "IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT\n"; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_DISCONNECT: - return "IOCTL_ND_CONNECTOR_DISCONNECT\n"; - - - - case IOCTL_ND_QP_CREATE: - return "IOCTL_ND_QP_CREATE\n"; - - case IOCTL_ND_MR_CREATE: - return "IOCTL_ND_MR_CREATE\n"; - - case IOCTL_ND_MR_FREE: - return "IOCTL_ND_MR_FREE\n"; - case IOCTL_ND_MR_REGISTER: - return "IOCTL_ND_MR_REGISTER\n"; - case IOCTL_ND_MR_DEREGISTER: - return "IOCTL_ND_MR_DEREGISTER\n"; - case IOCTL_ND_MR_CANCEL_IO: - return "IOCTL_ND_MR_CANCEL_IO\n"; - case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: - return "IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST\n"; - case IOCTL_ND_QP_FLUSH: - return "IOCTL_ND_QP_FLUSH\n"; - - default: - return "Unknown IOCTL\n"; - } -} -int get_irp_handle(struct 
hvnd_dev *nd_dev, u32 *local, void *irp_ctx) -{ - unsigned int local_handle; - int ret; - - local_handle = atomic_inc_return(&irp_local_hdl); - *local = local_handle; - - /* - * Now asssociate the local handle with the pointer. - */ - ret = insert_handle(nd_dev, &nd_dev->irpidr, irp_ctx, local_handle); - hvnd_debug("irp_ctx=%p local_handle=%u\n", irp_ctx, local_handle); - - if (ret) { - hvnd_error("insert_handle failed ret=%d\n", ret); - return ret; - } - - return 0; -} - -void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp) -{ - remove_handle(nd_dev, &nd_dev->irpidr, irp); - -} - -static void init_pfn(u64 *pfn, void *addr, u32 length) -{ - int i; - u32 offset = offset_in_page(addr); - u32 num_pfn = DIV_ROUND_UP(offset + length, PAGE_SIZE); - - for (i = 0; i < num_pfn; i++) - pfn[i] = virt_to_phys((u8 *)addr + - (PAGE_SIZE * i)) >> PAGE_SHIFT; -} - - -static void user_va_init_pfn(u64 *pfn, struct ib_umem *umem) -{ - int entry; - struct scatterlist *sg; - int i = 0; - - for_each_sg(umem->sg_head.sgl, sg, umem->nmap, entry) { - pfn[i++] = page_to_pfn(sg_page(sg)); - } -} - -static u32 get_local_handle(void *p) -{ - u64 val = (unsigned long)p; - - return (u32)val; -} - -static int hvnd_send_pg_buffer(struct hvnd_dev *nd_dev, - struct vmbus_packet_mpb_array *desc, - u32 desc_size, - void *buffer, - u32 bufferlen, u64 cookie) -{ - int ret; - int t; - struct hvnd_cookie hvnd_cookie; - - hvnd_cookie.pkt = (void *)cookie; - init_completion(&hvnd_cookie.host_event); - - ret = vmbus_sendpacket_mpb_desc(nd_dev->hvdev->channel, - desc, - desc_size, - buffer, bufferlen, - (u64)(&hvnd_cookie)); - - if (ret) { - hvnd_error("vmbus_sendpacket_mpb_desc failed ret=%d\n", ret); - goto err; - } - - t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); - - if (t == 0) { - hvnd_error("wait_for_completion_timeout timed out\n"); - ret = -ETIMEDOUT; - } - -err: - return ret; -} - -static int hvnd_send_packet(struct hvnd_dev *nd_dev, void *buffer, - u32 bufferlen, u64 cookie, bool block) -{ - int ret; - int t; - struct hvnd_cookie hvnd_cookie; - - hvnd_cookie.pkt = (void *)cookie; - init_completion(&hvnd_cookie.host_event); - - ret = vmbus_sendpacket(nd_dev->hvdev->channel, buffer, bufferlen, - (u64)(&hvnd_cookie), VM_PKT_DATA_INBAND, - VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED); - - if (ret) { - hvnd_error("vmbus_send pkt failed: %d\n", ret); - goto err; - } - - if (!block) - return ret; - - t = wait_for_completion_timeout(&hvnd_cookie.host_event, 500*HZ); - - if (t == 0) { - hvnd_error("wait_for_completion_timeout timed out\n"); - ret = -ETIMEDOUT; - } - -err: - return ret; -} - -static int hvnd_send_pgbuf_ioctl_pkt(struct hvnd_dev *nd_dev, - struct vmbus_packet_mpb_array *desc, - u32 desc_size, - struct ndv_packet_hdr_control_1 *hdr, - u32 pkt_size, u64 cookie) -{ - int ret; - int ioctl; - - ioctl = hdr->io_cntrl_code; - - - ret = hvnd_send_pg_buffer(nd_dev, desc, desc_size, - hdr, pkt_size, cookie); - - if (ret) - return ret; - - if (hdr->pkt_hdr.status != 0) { - hvnd_error("IOCTL: %s failed; status is %x\n", - hvnd_get_op_name(ioctl), - hdr->pkt_hdr.status); - return -EINVAL; - } - - switch (hdr->io_status) { - case STATUS_SUCCESS: - case STATUS_PENDING: - return 0; - - default: - hvnd_error("IOCTL: %s failed io status is %x\n", - hvnd_get_op_name(ioctl), hdr->io_status); - return -EINVAL; - } -} - -int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev, - struct ndv_packet_hdr_control_1 *hdr, - u32 pkt_size, u64 cookie) -{ - int ret; - int ioctl; - bool block; - - block = (hdr->irp_handle.val64 == 0) ? 
true : false; - - - ioctl = hdr->io_cntrl_code; - - ret = hvnd_send_packet(nd_dev, hdr, pkt_size, cookie, block); - - if (ret) - return ret; - - if (!block) - return ret; - - if (hdr->pkt_hdr.status != 0) { - hvnd_error("IOCTL: %s failed; status is %x\n", - hvnd_get_op_name(ioctl), hdr->pkt_hdr.status); - return -EINVAL; - } - - switch (hdr->io_status) { - case STATUS_SUCCESS: - case STATUS_PENDING: - return 0; - - default: - hvnd_warn("IOCTL: %s failed io status is %x\n", - hvnd_get_op_name(ioctl), hdr->io_status); - return -EINVAL; - } -} - -void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr, - u32 data_sz, u32 local, u32 remote, - u32 ioctl_code, - u32 ext_data_sz, u32 ext_data_offset, - u64 irp_handle) - -{ - int pkt_type; - - pkt_type = NDV_PKT_ID1_CONTROL; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - hdr->pkt_hdr.packet_type = pkt_type; - hdr->pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); - hdr->pkt_hdr.data_sz = data_sz; - - hdr->pkt_hdr.status = 0; - - hdr->file_handle.local = local; - hdr->file_handle.remote = remote; - hdr->irp_handle.val64 = irp_handle; - - hdr->io_cntrl_code = ioctl_code; - hdr->output_buf_sz = data_sz - ext_data_sz; - hdr->input_buf_sz = data_sz - ext_data_sz; - - hdr->input_output_buf_offset = 0; - - hdr->extended_data.size = ext_data_sz; - hdr->extended_data.offset = ext_data_offset; -} - - -int hvnd_create_file(struct hvnd_dev *nd_dev, void *uctx, - struct ndv_pkt_hdr_create_1 *create, u32 file_flags) -{ - int ret; - int pkt_type; - - - pkt_type = NDV_PKT_ID1_CREATE; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - create->pkt_hdr.packet_type = pkt_type; - create->pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); - create->pkt_hdr.data_sz = 0; - - create->handle.local = get_local_handle(uctx); - create->access_mask = STANDARD_RIGHTS_ALL; - create->open_options = OPEN_EXISTING; - create->file_attributes = FILE_ATTRIBUTE_NORMAL | file_flags; - create->share_access = FILE_SHARE_ALL; - - ret = hvnd_send_packet(nd_dev, create, - sizeof(struct ndv_pkt_hdr_create_1), - (unsigned long)create, true); - return ret; -} - -int hvnd_cleanup_file(struct hvnd_dev *nd_dev, u32 local, u32 remote) -{ - int ret; - int pkt_type; - struct ndv_pkt_hdr_cleanup_1 cleanup_pkt; - - - pkt_type = NDV_PKT_ID1_CLEANUP; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - - cleanup_pkt.pkt_hdr.packet_type = pkt_type; - cleanup_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_create_1); - cleanup_pkt.pkt_hdr.data_sz = 0; - - cleanup_pkt.handle.local = local; - cleanup_pkt.handle.remote = remote; - - ret = hvnd_send_packet(nd_dev, &cleanup_pkt, - sizeof(struct ndv_pkt_hdr_create_1), - (unsigned long)&cleanup_pkt, true); - return ret; -} - - -static int hvnd_do_ioctl(struct hvnd_dev *nd_dev, u32 ioctl, - struct pkt_nd_provider_ioctl *pkt, - union ndv_context_handle *hdr_handle, - struct nd_handle *ioctl_handle, - u8 *buf, u32 buf_len, bool c_in, bool c_out, u64 irp_val) -{ - int ret; - int pkt_type; - - pkt_type = NDV_PKT_ID1_CONTROL; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - - pkt->hdr.pkt_hdr.packet_type = pkt_type; - pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); - pkt->hdr.pkt_hdr.data_sz = (sizeof(struct pkt_nd_provider_ioctl) - - sizeof(struct ndv_packet_hdr_control_1)); - - pkt->hdr.file_handle.local = hdr_handle->local; - pkt->hdr.file_handle.remote = hdr_handle->remote; - hvnd_debug("create handle local: %x remote: %x\n", - 
hdr_handle->local, hdr_handle->remote); - - pkt->hdr.irp_handle.val64 = irp_val; - - pkt->hdr.io_cntrl_code = ioctl; - pkt->hdr.output_buf_sz = sizeof(struct nd_ioctl); - pkt->hdr.input_buf_sz = sizeof(struct nd_ioctl); - pkt->hdr.input_output_buf_offset = 0; - memset(&pkt->ioctl.handle, 0, sizeof(struct nd_handle)); - pkt->ioctl.handle.version = ND_VERSION_1; - - switch (ioctl) { - case IOCTL_ND_PROVIDER_BIND_FILE: - pkt->ioctl.handle.handle = ioctl_handle->handle; - break; - default: - break; - }; - - /* - * Copy the input buffer, if needed. - */ - - if (c_in && (buf != NULL)) - memcpy(pkt->ioctl.raw_buffer, buf, buf_len); - - ret = hvnd_send_packet(nd_dev, pkt, - sizeof(struct pkt_nd_provider_ioctl), - (unsigned long)pkt, true); - - if (ret) - return ret; - - if (c_out && (buf != NULL)) - memcpy(buf, pkt->ioctl.raw_buffer, buf_len); - - return ret; -} - -static int idr_callback(int id, void *p, void *data) -{ - if (p == data) - return id; - return 0; -} - -void remove_uctx(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) -{ - int pid = current_pid(); - unsigned long flags; - int id; - - if (get_uctx(nd_dev, pid) == uctx) - remove_handle(nd_dev, &nd_dev->uctxidr, pid); - else { - hvnd_warn("uctx %p not found on pid %d, doing a idr search\n", - uctx, current_pid()); - - spin_lock_irqsave(&nd_dev->id_lock, flags); - id = idr_for_each(&nd_dev->uctxidr, idr_callback, uctx); - spin_unlock_irqrestore(&nd_dev->id_lock, flags); - - if (id) - remove_handle(nd_dev, &nd_dev->uctxidr, id); - else { - hvnd_error("uctx %p not found in idr table\n", uctx); - return; - } - } - - kfree(uctx); -} - -int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) -{ - int ret; - - /* - * First close the adaptor. - */ - - ret = hvnd_free_handle(nd_dev, uctx, - uctx->adaptor_hdl, - IOCTL_ND_ADAPTER_CLOSE); - - if (ret) - hvnd_error("Adaptor close failed; ret is %x\n", ret); - - /* - * Now close the two files we created. - */ - - ret = hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, - uctx->file_handle_ovl.remote); - - if (ret) - hvnd_error("file cleanup failed; ret is %x\n", ret); - - ret = hvnd_cleanup_file(nd_dev, uctx->file_handle.local, - uctx->file_handle.remote); - - if (ret) - hvnd_error("File cleanup failed; ret is %x\n", ret); - - /* - * Remove the uctx from the ID table. - */ - remove_uctx(nd_dev, uctx); - - return 0; -} - -int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) -{ - int ret; - struct pkt_nd_provider_ioctl *pr_init = &uctx->pr_init_pkt; - int pkt_type; - struct nd_handle ioctl_handle; - struct pkt_nd_open_adapter *pr_o_adap = &uctx->o_adap_pkt; - - ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt, 0); - if (ret) { - hvnd_error("hvnd_create_file failed ret=%d\n", ret); - goto error_cr; - } - - if (uctx->create_pkt.pkt_hdr.status != 0) { - hvnd_error("create File failed; status is %d\n", - uctx->create_pkt.pkt_hdr.status); - ret = -EINVAL; - goto error_cr; - } - - uctx->file_handle.local = uctx->create_pkt.handle.local; - uctx->file_handle.remote = uctx->create_pkt.handle.remote; - hvnd_debug("INITIALIZE PROVIDER\n"); - /* - * Now Initialize the Provider. 
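*
* Provider init is the first user of the request shape every control
* path in this file follows: a struct that opens with an
* ndv_packet_hdr_control_1, a header aimed at the right file handles
* and IOCTL code, then a blocking send. A minimal sketch, with a
* hypothetical packet type and IOCTL code standing in for the real
* ones (the actual paths go through hvnd_do_ioctl() here and
* hvnd_init_hdr() everywhere else):
*
*	struct pkt_nd_example {				// hypothetical
*		struct ndv_packet_hdr_control_1 hdr;
*		struct {
*			u32 version;		// ND_VERSION_1
*			u32 reserved;		// keep the u64 below aligned
*			u64 handle;
*		} in;
*	} pkt;
*	int ret;
*
*	memset(&pkt, 0, sizeof(pkt));
*	hvnd_init_hdr(&pkt.hdr, sizeof(pkt) - sizeof(pkt.hdr),
*		      uctx->create_pkt.handle.local,
*		      uctx->create_pkt.handle.remote,
*		      IOCTL_ND_EXAMPLE,		// hypothetical
*		      0, 0, 0);			// no ext data; irp 0 == block
*	pkt.in.version = ND_VERSION_1;
*	ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt);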
- */ - ioctl_handle.handle = 0; - ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_INIT, pr_init, - &uctx->create_pkt.handle, - &ioctl_handle, NULL, 0, false, false, 0); - - if (ret) { - ret = -EINVAL; - goto error_pr_init; - } - - if (pr_init->hdr.pkt_hdr.status != 0) { - hvnd_error("Provider INIT failed; status is %d\n", - pr_init->hdr.pkt_hdr.status); - ret = -EINVAL; - goto error_pr_init; - } - - if (pr_init->hdr.io_status != 0) { - hvnd_error("Provider INIT failed; io status is %d\n", - pr_init->hdr.io_status); - ret = -EINVAL; - goto error_pr_init; - } - - /* - * Now create the overlap file. - */ - - hvnd_debug("CREATE OVERLAP FILE\n"); - ret = hvnd_create_file(nd_dev, uctx, &uctx->create_pkt_ovl, - FILE_FLAG_OVERLAPPED); - if (ret) { - hvnd_error("hvnd_create_file failed ret=%d\n", ret); - goto error_pr_init; - } - - if (uctx->create_pkt_ovl.pkt_hdr.status != 0) { - hvnd_error("create Overlap File failed; status is %d\n", - uctx->create_pkt_ovl.pkt_hdr.status); - ret = -EINVAL; - goto error_pr_init; - } - uctx->file_handle_ovl.local = uctx->create_pkt_ovl.handle.local; - uctx->file_handle_ovl.remote = uctx->create_pkt_ovl.handle.remote; - - /* - * Now bind the two file handles together. - */ - - hvnd_debug("BIND FILE IOCTL remote handle: %d local handle: %d\n", - uctx->create_pkt_ovl.handle.remote, - uctx->create_pkt_ovl.handle.local); - - ioctl_handle.handle = uctx->create_pkt_ovl.handle.val64; - ret = hvnd_do_ioctl(nd_dev, IOCTL_ND_PROVIDER_BIND_FILE, pr_init, - &uctx->create_pkt.handle, - &ioctl_handle, NULL, 0, false, false, 0); - - if (ret) { - ret = -EINVAL; - goto error_file_bind; - } - if (pr_init->hdr.pkt_hdr.status != 0) { - hvnd_error("Provider File bind failed; status is %d\n", - pr_init->hdr.pkt_hdr.status); - ret = -EINVAL; - goto error_file_bind; - } - if (pr_init->hdr.io_status != 0) { - hvnd_error("Provider INIT failed; io status is %d\n", - pr_init->hdr.io_status); - ret = -EINVAL; - goto error_file_bind; - } - - /* - * Now open the adaptor. 
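*
* Adaptor open is also the first packet that carries extended data,
* so the header bookkeeping done below is worth stating in one place:
* data_sz counts everything after the control header, the
* input/output buffer sizes count only the ioctl portion (data_sz
* minus the extended data), and extended_data.offset is measured from
* the end of the control header. In shorthand:
*
*	data_sz    = sizeof(pkt) - sizeof(struct ndv_packet_hdr_control_1);
*	buf_sz     = data_sz - sizeof(struct extended_data_oad);
*	ext_offset = offsetof(struct pkt_nd_open_adapter, ext_data)
*			- sizeof(struct ndv_packet_hdr_control_1);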
- */ - - hvnd_debug("OPENING THE ADAPTOR\n"); - - pkt_type = NDV_PKT_ID1_CONTROL; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - pr_o_adap->hdr.pkt_hdr.packet_type = pkt_type; - pr_o_adap->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); - pr_o_adap->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_open_adapter) - - sizeof(struct ndv_packet_hdr_control_1); - - pr_o_adap->hdr.pkt_hdr.status = 0; - - hvnd_debug("hdr sz is %d\n", pr_o_adap->hdr.pkt_hdr.hdr_sz); - hvnd_debug("data sz is %d\n", pr_o_adap->hdr.pkt_hdr.data_sz); - - pr_o_adap->hdr.file_handle.local = uctx->create_pkt.handle.local; - pr_o_adap->hdr.file_handle.remote = uctx->create_pkt.handle.remote; - hvnd_debug("create handle local is %x\n", - uctx->create_pkt.handle.local); - hvnd_debug("create handle remote is %x\n", - uctx->create_pkt.handle.remote); - pr_o_adap->hdr.irp_handle.val64 = 0; - - pr_o_adap->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_OPEN; - pr_o_adap->hdr.output_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - - sizeof(struct extended_data_oad); - pr_o_adap->hdr.input_buf_sz = pr_o_adap->hdr.pkt_hdr.data_sz - - sizeof(struct extended_data_oad); - - hvnd_debug("output buf sz is %d\n", pr_o_adap->hdr.output_buf_sz); - hvnd_debug("input buf sz is %d\n", pr_o_adap->hdr.input_buf_sz); - hvnd_debug("packet size is %d\n", - (int)sizeof(struct pkt_nd_open_adapter)); - - pr_o_adap->hdr.input_output_buf_offset = 0; - - - pr_o_adap->hdr.extended_data.size = sizeof(struct extended_data_oad); - pr_o_adap->hdr.extended_data.offset = - offsetof(struct pkt_nd_open_adapter, ext_data) - - sizeof(struct ndv_packet_hdr_control_1); - - hvnd_debug("size of the extended data size: %d\n", - (int)sizeof(struct extended_data_oad)); - hvnd_debug("offset of extended data: %d\n", - pr_o_adap->hdr.extended_data.offset); - - /* - * Now fill out the ioctl section. - */ - - pr_o_adap->ioctl.input.version = ND_VERSION_1; - pr_o_adap->ioctl.input.ce_mapping_cnt = - RTL_NUMBER_OF(pr_o_adap->mappings.ctx_input.mappings); - - hvnd_debug("ce_mapping cnt is %d\n", - pr_o_adap->ioctl.input.ce_mapping_cnt); - - pr_o_adap->ioctl.input.cb_mapping_offset = sizeof(union oad_ioctl); - hvnd_debug("cb_mapping offset is %d\n", - pr_o_adap->ioctl.input.cb_mapping_offset); - pr_o_adap->ioctl.input.adapter_id = (u64)nd_dev; - - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_type = ND_MAP_IOSPACE; - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cache_type = ND_NON_CACHED; - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_UAR].map_io_space.cb_length = 4096; - - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_type = ND_MAP_IOSPACE; - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cache_type = ND_WRITE_COMBINED; - pr_o_adap->mappings.ctx_input.mappings[IBV_GET_CONTEXT_BF].map_io_space.cb_length = 4096; - - /* - * Fill in the extended data. 
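*
* The two mappings requested above are the mlx4 doorbell page (UAR)
* and the BlueFlame page. ND_NON_CACHED for the UAR matches plain
* uncached MMIO doorbells, while ND_WRITE_COMBINED lets BlueFlame
* batch descriptor writes; presumably the host honors these as the
* cache attributes of the pages whose addresses come back in
* mapping_results[].info, which userspace later reaches via mmap.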
- */ - pr_o_adap->ext_data.cnt = IBV_GET_CONTEXT_MAPPING_MAX; - - ret = hvnd_send_packet(nd_dev, pr_o_adap, - sizeof(struct pkt_nd_open_adapter), - (unsigned long)pr_o_adap, true); - if (ret) { - ret = -EINVAL; - goto error_file_bind; - } - - if (pr_o_adap->hdr.pkt_hdr.status != 0) { - hvnd_error("Open adaptor failed; status is %d\n", - pr_o_adap->hdr.pkt_hdr.status); - ret = -EINVAL; - goto error_file_bind; - } - - if (pr_o_adap->hdr.io_status != 0) { - hvnd_error("Open adaptor failed;io status is %d\n", - pr_o_adap->hdr.io_status); - ret = -EINVAL; - goto error_file_bind; - } - - /* - * Copy the necessary response from the host. - */ - - uctx->adaptor_hdl = pr_o_adap->ioctl.resrc_desc.handle; - - - hvnd_debug("adaptor handle: %p\n", (void *)uctx->adaptor_hdl); - - uctx->uar_base = - pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_UAR].info; - hvnd_debug("uar base: %p\n", (void *)uctx->uar_base); - - uctx->bf_base = - pr_o_adap->mappings.ctx_output.mapping_results[IBV_GET_CONTEXT_BF].info; - hvnd_debug("bf base: %p\n", (void *)uctx->bf_base); - - uctx->bf_buf_size = - pr_o_adap->mappings.ctx_output.bf_buf_size; - hvnd_debug("bf buf size: %d\n", uctx->bf_buf_size); - - uctx->bf_offset = - pr_o_adap->mappings.ctx_output.bf_offset; - hvnd_debug("bf offset: %d\n", uctx->bf_offset); - - uctx->cqe_size = - pr_o_adap->mappings.ctx_output.cqe_size; - hvnd_debug("cqe size: %d\n", uctx->cqe_size); - - uctx->max_qp_wr = - pr_o_adap->mappings.ctx_output.max_qp_wr; - hvnd_debug("max qp wr: %d\n", uctx->max_qp_wr); - - uctx->max_sge = - pr_o_adap->mappings.ctx_output.max_sge; - hvnd_debug("max sge: %d\n", uctx->max_sge); - - uctx->max_cqe = - pr_o_adap->mappings.ctx_output.max_cqe; - hvnd_debug("max cqe: %d\n", uctx->max_cqe); - - uctx->num_qps = - pr_o_adap->mappings.ctx_output.qp_tab_size; - hvnd_debug("num qps: %d\n", uctx->num_qps); - - /* - * Now query the adaptor and stash away the adaptor info. - */ - - ret = hvnd_query_adaptor(nd_dev, uctx); - if (ret) { - hvnd_error("Query Adaptor failed; ret is %d\n", ret); - goto query_err; - } - - return ret; - -query_err: - hvnd_free_handle(nd_dev, uctx, - uctx->adaptor_hdl, - IOCTL_ND_ADAPTER_CLOSE); - - hvnd_error("Open Adaptor Failed!!\n"); - -error_file_bind: - hvnd_cleanup_file(nd_dev, uctx->file_handle_ovl.local, - uctx->file_handle_ovl.remote); - -error_pr_init: - hvnd_cleanup_file(nd_dev, uctx->file_handle.local, - uctx->file_handle.remote); - -error_cr: - if (get_uctx(nd_dev, current_pid()) != NULL) - remove_handle(nd_dev, &nd_dev->uctxidr, current_pid()); - - return ret; -} - -int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - struct hvnd_cq *cq) -{ - int ret; - struct pkt_nd_create_cq *pkt; - int num_pfn, num_db_pfn, num_sn_pfn; - int cq_pkt_size; - unsigned int cq_buf_size, offset; - u32 ext_data_sz; - u32 ext_data_offset; - - /* - * Now create CQ. - * First compute the number of PFNs we need to accomodate: - * One each for door bell and arm_sn and pages in cq buffer. 
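*
* The offset term matters: DIV_ROUND_UP() below must allow for the
* buffer starting mid-page. For example, with 256 CQEs of 64 bytes
* (16384 bytes) starting 512 bytes into a page,
* DIV_ROUND_UP(512 + 16384, 4096) = 5 pages, one more than the byte
* count alone would suggest.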
- */ - cq_buf_size = (cq->entries * uctx->cqe_size); - offset = offset_in_page(cq->cq_buf); - num_pfn = DIV_ROUND_UP(offset + cq_buf_size, PAGE_SIZE); - - offset = offset_in_page(cq->db_addr); - num_db_pfn = DIV_ROUND_UP(offset + 8, PAGE_SIZE); - - offset = offset_in_page(&cq->arm_sn); - num_sn_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); - - cq_pkt_size = sizeof(struct pkt_nd_create_cq) + - (num_pfn * sizeof(u64)); - - ext_data_sz = sizeof(struct create_cq_ext_data) + - (num_pfn * sizeof(u64)); - ext_data_offset = offsetof(struct pkt_nd_create_cq, ext_data) - - sizeof(struct ndv_packet_hdr_control_1); - - hvnd_debug("CREATE CQ, num user addr pfns is %d\n", num_pfn); - hvnd_debug("CREATE CQ, num db pfns is %d\n", num_db_pfn); - - pkt = kzalloc(cq_pkt_size, GFP_KERNEL); - - if (!pkt) - return -ENOMEM; - - hvnd_init_hdr(&pkt->hdr, - (cq_pkt_size - - sizeof(struct ndv_packet_hdr_control_1)), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CQ_CREATE, - ext_data_sz, - ext_data_offset, - 0); - - /* - * Now fill out the ioctl section. - */ - - pkt->ioctl.input.version = ND_VERSION_1; - pkt->ioctl.input.queue_depth = cq->entries; - pkt->ioctl.input.ce_mapping_cnt = MLX4_IB_CREATE_CQ_MAPPING_MAX; - pkt->ioctl.input.cb_mapping_offset = sizeof(union create_cq_ioctl); - - hvnd_debug("ce_mapping cnt is %d\n", pkt->ioctl.input.ce_mapping_cnt); - hvnd_debug("cb_mapping offset is %d\n", - pkt->ioctl.input.cb_mapping_offset); - - pkt->ioctl.input.adapter_handle = uctx->adaptor_hdl; - pkt->ioctl.input.affinity.mask = 0; - pkt->ioctl.input.affinity.group = -1; - - // 0 for usermode CQ arming - pkt->mappings.cq_in.flags = 0; - - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.map_type = ND_MAP_MEMORY; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.access_type = ND_MODIFY_ACCESS; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.address = (u64)cq->cq_buf; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_BUF].map_memory.cb_length = (cq->entries * uctx->cqe_size); - - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.access_type = ND_WRITE_ACCESS; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.address = (u64)cq->db_addr; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_DB].map_memory.cb_length = 8; //size of two ints - - - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.map_type = ND_MAP_MEMORY; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.access_type = ND_MODIFY_ACCESS; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.address = (u64)&cq->arm_sn; - pkt->mappings.cq_in.mappings[MLX4_IB_CREATE_CQ_ARM_SN].map_memory.cb_length = 4; //size of one int - /* - * Fill in the extended data. - */ - - pkt->ext_data.cnt = 3; - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].size = (sizeof(struct gpa_range) + (num_pfn * sizeof(u64))); - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_BUF].offset = offsetof(struct create_cq_ext_data, cqbuf_gpa); - - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].size = sizeof(struct cq_db_gpa); - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_DB].offset = offsetof(struct create_cq_ext_data, db_gpa); - - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].size = sizeof(struct cq_db_gpa); - pkt->ext_data.fields[MLX4_IB_CREATE_CQ_ARM_SN].offset = offsetof(struct create_cq_ext_data, sn_gpa); - - /* - * Fill up the gpa range for cq buffer. 
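*
* A gpa_range describes a guest buffer as a byte count, an offset
* into the first page, and the PFN of every page touched, so the
* host can map it without sharing a virtual address. For a
* kernel-virtual buffer the earlier helper fills it directly; a
* sketch for a hypothetical buffer 'buf' of 'len' bytes:
*
*	range->byte_count  = len;
*	range->byte_offset = offset_in_page(buf);
*	init_pfn(range->pfn_array, buf, len);	// kernel VA path
*
* User buffers such as the CQ ring itself are pinned as an ib_umem
* and go through user_va_init_pfn() instead, which walks the umem
* scatterlist.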
- */ - - pkt->ext_data.db_gpa.byte_count = 8; - pkt->ext_data.db_gpa.byte_offset = offset_in_page(cq->db_addr); - user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], cq->db_umem); - - pkt->ext_data.sn_gpa.byte_count = 4; - pkt->ext_data.sn_gpa.byte_offset = offset_in_page(&cq->arm_sn); - init_pfn(&pkt->ext_data.sn_gpa.pfn_array[0], - &cq->arm_sn, - 4); - - pkt->ext_data.cqbuf_gpa.byte_count = (cq->entries * uctx->cqe_size); - pkt->ext_data.cqbuf_gpa.byte_offset = offset_in_page(cq->cq_buf); - user_va_init_pfn(&pkt->ext_data.cqbuf_gpa.pfn_array[0], cq->umem); - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, cq_pkt_size, (u64)pkt); - - if (ret) - goto cr_cq_err; - - /* - * Copy the necessary response from the host. - */ - cq->cqn = pkt->mappings.cq_resp.cqn; - cq->cqe = pkt->mappings.cq_resp.cqe; - cq->cq_handle = pkt->ioctl.resrc_desc.handle; - - ret = insert_handle(nd_dev, &nd_dev->cqidr, cq, cq->cqn); - - if (ret) - goto cr_cq_err; - hvnd_debug("CQ create after success cqn is %d\n", cq->cqn); - hvnd_debug("CQ create after success cqe is %d\n", cq->cqe); - hvnd_debug("CQ create after success cq handle is %p\n", - (void *)cq->cq_handle); - -cr_cq_err: - kfree(pkt); - return ret; -} - -int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq) -{ - struct pkt_nd_free_cq free_cq_pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&free_cq_pkt, 0, sizeof(free_cq_pkt)); - hvnd_init_hdr(&free_cq_pkt.hdr, - sizeof(struct pkt_nd_free_cq) - - sizeof(struct ndv_packet_hdr_control_1), - cq->uctx->create_pkt.handle.local, - cq->uctx->create_pkt.handle.remote, - IOCTL_ND_CQ_FREE, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - free_cq_pkt.ioctl.in.version = ND_VERSION_1; - free_cq_pkt.ioctl.in.handle = cq->cq_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &free_cq_pkt.hdr, - sizeof(struct pkt_nd_free_cq), - (u64)&free_cq_pkt); - - if (ret) - goto free_cq_err; - - remove_handle(nd_dev, &nd_dev->cqidr, cq->cqn); - - return 0; - -free_cq_err: - return ret; -} - - -int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq, - u32 notify_type, u64 irp_handle) -{ - struct pkt_nd_notify_cq notify_cq_pkt; - int ret; - union ndv_context_handle irp_fhandle; - - irp_fhandle.local = cq->ep_object.local_irp; - - // KYS try to avoid having to zero everything - memset(¬ify_cq_pkt, 0, sizeof(notify_cq_pkt)); - hvnd_init_hdr(¬ify_cq_pkt.hdr, - sizeof(struct pkt_nd_notify_cq) - - sizeof(struct ndv_packet_hdr_control_1), - cq->uctx->create_pkt.handle.local, - cq->uctx->create_pkt.handle.remote, - IOCTL_ND_CQ_NOTIFY, 0, 0, irp_fhandle.val64); - - /* - * Now fill in the ioctl section. - */ - notify_cq_pkt.ioctl.in.version = ND_VERSION_1; - notify_cq_pkt.ioctl.in.cq_handle = cq->cq_handle; - notify_cq_pkt.ioctl.in.type = notify_type; - - - ret = hvnd_send_ioctl_pkt(nd_dev, ¬ify_cq_pkt.hdr, - sizeof(struct pkt_nd_notify_cq), - (u64)¬ify_cq_pkt); - - return ret; -} - -/* - * Memory region operations. - */ -int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 pd_handle, u64 *mr_handle) -{ - struct pkt_nd_create_mr pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_MR_CREATE, 0, 0, 0); - - /* - * Now fill in the ioctl section. 
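*
* As with the CQ just above, every object ends up in two namespaces:
* the host returns an opaque 64-bit handle that is quoted back in
* later ioctls, while the guest also files the object in the matching
* idr (cqidr, qpidr, mmidr, ...) under its small local id such as the
* cqn, so that later lookups keyed by that id find the in-kernel
* object again.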
- */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.handle = pd_handle; - hvnd_debug("PD handle is %p\n", (void *)pd_handle); - pkt.ioctl.in.reserved = 0; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - /* - * Copy the handle. - */ - hvnd_debug("mr handle is %p\n", (void *)pkt.ioctl.out); - *mr_handle = pkt.ioctl.out; - - return 0; - -err: - hvnd_error("create mr failed: %d\n", ret); - return ret; - -} - -int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 handle) -{ - return hvnd_free_handle(nd_dev, uctx, handle, IOCTL_ND_MR_FREE); -} - -int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 handle) -{ - struct pkt_nd_deregister_mr pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_MR_DEREGISTER, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.handle = handle; - pkt.ioctl.in.reserved = 0; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - hvnd_error("de-register mr failed: %d\n", ret); - return ret; - -} - -static inline u32 hvnd_convert_access(int acc) -{ - return (acc & IB_ACCESS_REMOTE_WRITE ? ND_MR_FLAG_ALLOW_REMOTE_WRITE : 0) | - (acc & IB_ACCESS_REMOTE_READ ? ND_MR_FLAG_ALLOW_REMOTE_READ : 0) | - (acc & IB_ACCESS_LOCAL_WRITE ? ND_MR_FLAG_ALLOW_LOCAL_WRITE : 0); -} - - -int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - struct hvnd_mr *mr) -{ - struct pkt_nd_register_mr pkt; - int ret; - struct hv_mpb_array *pb; - struct vmbus_packet_mpb_array *tpb; - int sz_leaf; - int num_pgs; - int i = 0; - int ext_data_sz; - u32 acc_flags; - u32 desc_size; - int pkt_type; - - /* - * The user address is passed in via a two level structure. - * An Array of struct hv_page_buffer will be used to describe - * the user memory. The pages containing this array will be descibed - * in another array of struct hv_page_buffer. We pass this seconed level - * array to the host. - */ - - hvnd_debug("ib_umem_page_count(mr->umem)=%d\n", - ib_umem_page_count(mr->umem)); - - sz_leaf = ib_umem_page_count(mr->umem) * sizeof(u64) + - sizeof(struct hv_mpb_array); - - pb = (struct hv_mpb_array *) __get_free_pages(GFP_KERNEL|__GFP_ZERO, - get_order(sz_leaf)); - - if (pb == NULL) - return -ENOMEM; - /* - * Allocate an array of hv_page_buffer to describe the first level. - */ - num_pgs = DIV_ROUND_UP(sz_leaf, PAGE_SIZE); - hvnd_debug("num pages in the top array is %d\n", num_pgs); - - desc_size = (num_pgs * sizeof(u64) + - sizeof(struct vmbus_packet_mpb_array)); - tpb = (struct vmbus_packet_mpb_array *) - __get_free_pages(GFP_KERNEL|__GFP_ZERO, get_order(desc_size)); - - if (tpb == NULL) { - free_pages((unsigned long)pb, get_order(sz_leaf)); - return -ENOMEM; - } - - hvnd_debug("sz leaf: %d; pgs in top %d\n", sz_leaf, num_pgs); - - /* - * Now fill the leaf level array. - */ - pb->len = mr->length; - pb->offset = offset_in_page(mr->start); - user_va_init_pfn(pb->pfn_array, mr->umem); - - /* - * Now fill out the top level array. 
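*
* A worked example of the two levels: registering 1 MiB of user
* memory pins 256 pages, so the leaf hv_mpb_array needs
* 256 * sizeof(u64) bytes of PFNs plus its small header and fits in
* a single page; the top-level vmbus_packet_mpb_array then carries
* just that one PFN:
*
*	sz_leaf = 256 * sizeof(u64) + sizeof(struct hv_mpb_array);
*	num_pgs = DIV_ROUND_UP(sz_leaf, PAGE_SIZE);	// == 1 here
*
* Only at a little over 500 pinned pages (roughly 2 MiB) does the
* top level grow to a second page.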
- */ - for (i = 0; i < num_pgs; i++) { - tpb->range.pfn_array[i] = virt_to_phys( - (u8 *)pb + (PAGE_SIZE * i)) >> PAGE_SHIFT; - hvnd_debug("virtual address = %p\n", - (u8 *)pb + (PAGE_SIZE * i)); - hvnd_debug("physical address = %llx\n", - virt_to_phys((u8 *)pb + (PAGE_SIZE * i))); - hvnd_debug("tpb->range.pfn_array[%d]=%llx\n", i, - tpb->range.pfn_array[i]); - } - - tpb->range.offset = 8; - tpb->range.len = ib_umem_page_count(mr->umem) * sizeof(u64); - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - ext_data_sz = (ib_umem_page_count(mr->umem) * sizeof(u64)); - acc_flags = ND_MR_FLAG_DO_NOT_SECURE_VM | hvnd_convert_access(mr->acc); - hvnd_debug("memory register access flags are: %x\n", acc_flags); - - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_MR_REGISTER, 0, 0, 0); - - /* - * The memory registration call uses a different mechanism to pass - * pfn information. - */ - - pkt_type = pkt.hdr.pkt_hdr.packet_type; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTION_EXTERNAL_DATA); - pkt.hdr.pkt_hdr.packet_type = pkt_type; - - pkt.hdr.extended_data.size = ext_data_sz; - pkt.hdr.extended_data.offset = 0; - /* - * Now fill out the ioctl. - */ - - pkt.ioctl.in.header.version = ND_VERSION_1; - pkt.ioctl.in.header.flags = acc_flags; - pkt.ioctl.in.header.cb_length = mr->length; - pkt.ioctl.in.header.target_addr = mr->virt; - pkt.ioctl.in.header.mr_handle = mr->mr_handle; - pkt.ioctl.in.address = mr->virt; - - /* - * Now send the packet to the host. - */ - - ret = hvnd_send_pgbuf_ioctl_pkt(nd_dev, - tpb, desc_size, - &pkt.hdr, - sizeof(pkt), - (unsigned long)&pkt); - - if (ret) - goto err; - - hvnd_info("MR REGISTRATION SUCCESS\n"); - /* - * Copy the mr registration data. - */ - hvnd_debug("mr registration lkey %x\n", pkt.ioctl.out.lkey); - hvnd_debug("mr registration rkey %x\n", pkt.ioctl.out.rkey); - - mr->mr_lkey = pkt.ioctl.out.lkey; - mr->mr_rkey = pkt.ioctl.out.rkey; - - mr->ibmr.lkey = mr->mr_lkey; - mr->ibmr.rkey = be32_to_cpu(mr->mr_rkey); - hvnd_debug("ibmr registration lkey %x\n", mr->ibmr.lkey); - hvnd_debug("ibmr registration rkey %x\n", mr->ibmr.rkey); - - free_pages((unsigned long)pb, get_order(sz_leaf)); - free_pages((unsigned long)tpb, get_order(desc_size)); - - return 0; - -err: - free_pages((unsigned long)pb, get_order(sz_leaf)); - free_pages((unsigned long)tpb, get_order(desc_size)); - - hvnd_error("mr register failed: %d\n", ret); - return ret; -} - -/* - * Listener operations. - */ -int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 *listener_handle) -{ - struct pkt_nd_cr_listener pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_LISTENER_CREATE, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; - hvnd_debug("Adaptor handle is %p\n", (void *)uctx->adaptor_hdl); - pkt.ioctl.in.hdr.reserved = 0; - pkt.ioctl.in.to_semantics = false; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - /* - * Copy the listener handle. 
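*
* (Two easy-to-miss details on the MR path above:
* hvnd_convert_access() maps the verbs access bits one-for-one, e.g.
*
*	hvnd_convert_access(IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_READ)
*	    == ND_MR_FLAG_ALLOW_LOCAL_WRITE | ND_MR_FLAG_ALLOW_REMOTE_READ;
*
* and the rkey evidently comes back from the host in big-endian form,
* hence the be32_to_cpu() applied to mr->ibmr.rkey while the lkey is
* used as-is.)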
- */ - hvnd_debug("listener handle is %p\n", (void *)pkt.ioctl.out); - *listener_handle = pkt.ioctl.out; - - return 0; - -err: - hvnd_error("create listener failed: ret=%d uctx=%p adaptor handle=%llu\n", - ret, uctx, uctx->adaptor_hdl); - return ret; - -} - -int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 listener_handle) -{ - struct pkt_nd_free_listener pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_LISTENER_FREE, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.handle = listener_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - hvnd_error("free listener failed: %d\n", ret); - return ret; -} - -int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 listener_handle, union nd_sockaddr_inet *addr) -{ - struct pkt_nd_bind_listener pkt; - kuid_t uid = current_uid(); - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_LISTENER_BIND, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.handle = listener_handle; - pkt.ioctl.in.hdr.reserved = 0; - - pkt.ioctl.in.authentication_id = (u32)uid.val; - pkt.ioctl.in.is_admin = false; - - memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - hvnd_error("bind listener failed: %d\n", ret); - return ret; -} - -int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 listener_handle, u32 backlog) -{ - struct pkt_nd_listen_listener pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_LISTENER_LISTEN, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.listener_handle = listener_handle; - pkt.ioctl.in.back_log = backlog; - - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - hvnd_error("listen listener failed: %d\n", ret); - return ret; -} - -int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 listener_handle, union nd_sockaddr_inet *addr) -{ - struct pkt_nd_get_addr_listener pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_LISTENER_GET_ADDRESS, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.handle = listener_handle; - - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - /* - * Copy the adddress. 
- */ - - memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); - - return 0; - -err: - hvnd_error("listen listener failed: %d\n", ret); - return ret; -} - -int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 listener_handle, u64 connector_handle, - u64 irp_handle) -{ - struct pkt_nd_get_connection_listener pkt; - int ret; - union ndv_context_handle irp_fhandle; - - ret = get_irp_handle(nd_dev, &irp_fhandle.local, (void *)irp_handle); - - if (ret) { - hvnd_error("get_irp_handle() failed: err: %d\n", ret); - return ret; - } - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST, 0, 0, - irp_fhandle.val64); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.listener_handle = listener_handle; - pkt.ioctl.in.connector_handle = connector_handle; - - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - hvnd_error("get connection listener failed: %d\n", ret); - return ret; -} - -/* - * Connector APIs. - */ - -int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 *connector_handle) -{ - struct pkt_nd_cr_connector pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(struct pkt_nd_cr_listener) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_CREATE, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.handle = uctx->adaptor_hdl; - pkt.ioctl.in.to_semantics = false; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - /* - * Copy the listener handle. - */ - hvnd_debug("connector handle is %p\n", (void *)pkt.ioctl.out); - *connector_handle = pkt.ioctl.out; - - return 0; - -err: - return ret; -} - -int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 handle) -{ - struct pkt_nd_free_connector pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_FREE, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.handle = handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - return ret; -} - -int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 handle, union nd_sockaddr_inet *addr) -{ - struct pkt_nd_bind_connector pkt; - int ret; - kuid_t uid = current_uid(); - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_BIND, 0, 0, 0); - - /* - * Now fill in the ioctl section. 
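*
* (A note on the asynchronous calls in this neighborhood: requests
* that complete later, such as get-connection above and
* connect/accept/disconnect below, carry a non-zero irp_handle. The
* local u32 comes from get_irp_handle(), which parks the caller's
* context in irpidr so the host's eventual completion, echoing that
* u32, can be routed back; hvnd_send_ioctl_pkt() treats a non-zero
* irp_handle as "do not block and wait". Synchronous requests like
* this bind simply pass 0.)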
- */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.handle = handle; - - memcpy(&pkt.ioctl.in.hdr.address, addr, sizeof(*addr)); - - pkt.ioctl.in.authentication_id = (u32)uid.val; - pkt.ioctl.in.is_admin = false; - - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - return ret; -} - -int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit, - u32 priv_data_length, const u8 *priv_data, - u64 qp_handle, struct if_physical_addr *phys_addr, - union nd_sockaddr_inet *dest_addr, - struct hvnd_ep_obj *ep) -{ - struct pkt_nd_connector_connect *pkt = &ep->connector_connect_pkt; - int ret; - union ndv_context_handle irp_fhandle; - - hvnd_debug("local irp is %d\n", ep->local_irp); - irp_fhandle.local = ep->local_irp; - - if (priv_data_length > MAX_PRIVATE_DATA_LEN) { - hvnd_error("priv_data_length=%d\n", priv_data_length); - return -EINVAL; - } - - // KYS try to avoid having to zero everything - memset(pkt, 0, sizeof(*pkt)); - hvnd_init_hdr(&pkt->hdr, - sizeof(*pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_CONNECT, 0, 0, irp_fhandle.val64); - - /* - * Now fill in the ioctl section. - */ - pkt->ioctl.in.hdr.version = ND_VERSION_1; - pkt->ioctl.in.hdr.connector_handle = connector_handle; - pkt->ioctl.in.hdr.read_limits.inbound = in_rd_limit; - pkt->ioctl.in.hdr.read_limits.outbound = out_rd_limit; - pkt->ioctl.in.hdr.cb_private_data_length = priv_data_length; - pkt->ioctl.in.hdr.cb_private_data_offset = - offsetof(union connector_connect_ioctl, in.priv_data); - pkt->ioctl.in.hdr.qp_handle = qp_handle; - - memcpy(&pkt->ioctl.in.hdr.phys_addr, phys_addr, - sizeof(struct if_physical_addr)); - - /* - * Luke's code does not copy the ip address. - */ - memcpy(&pkt->ioctl.in.hdr.destination_address, dest_addr, - sizeof(union nd_sockaddr_inet)); - - pkt->ioctl.in.retry_cnt = 7; - pkt->ioctl.in.rnr_retry_cnt = 7; - memcpy(pkt->ioctl.in.priv_data, priv_data, priv_data_length); - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, sizeof(*pkt), (u64)pkt); - - if (ret) - goto err; - - return 0; - -err: - return ret; -} - -int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, - enum ibv_qp_state *qp_state) -{ - struct pkt_nd_connector_connect_complete pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_COMPLETE_CONNECT, 0, 0, 0); - - /* - * Now fill in the ioctl section. 
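*
* (The connect request above hard-codes retry_cnt = 7 and
* rnr_retry_cnt = 7. Both counters are 3-bit in InfiniBand, so 7 is
* the maximum transport retry count and, for RNR retries, the
* IB-defined "retry forever" value; reasonable defaults when the
* caller supplies nothing to map them from.)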
- */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.handle = connector_handle; - pkt.ioctl.in.rnr_nak_to = 0; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - *qp_state = pkt.ioctl.out.state; - return 0; - -err: - return ret; -} - -int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 connector_handle, - u64 qp_handle, - u32 in_rd_limit, u32 out_rd_limit, - u32 priv_data_length, const u8 *priv_data, - enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep) -{ - struct pkt_nd_connector_accept pkt; - int ret; - union ndv_context_handle irp_fhandle; - - irp_fhandle.local = ep->local_irp; - - if (priv_data_length > MAX_PRIVATE_DATA_LEN) { - hvnd_error("priv_data_length=%d\n", priv_data_length); - return -EINVAL; - } - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_ACCEPT, 0, 0, irp_fhandle.val64); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.reserved = 0; - pkt.ioctl.in.hdr.read_limits.inbound = in_rd_limit; - pkt.ioctl.in.hdr.read_limits.outbound = out_rd_limit; - pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; - - pkt.ioctl.in.hdr.cb_private_data_offset = - offsetof(struct connector_accept_in, private_data); - - pkt.ioctl.in.hdr.connector_handle = connector_handle; - pkt.ioctl.in.hdr.qp_handle = qp_handle; - - pkt.ioctl.in.rnr_nak_to = 0; - pkt.ioctl.in.rnr_retry_cnt = 7; - - - memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - *qp_state = pkt.ioctl.out.state; - return 0; - -err: - return ret; -} - -int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 connector_handle, - u32 priv_data_length, u8 *priv_data, - enum ibv_qp_state *qp_state) -{ - struct pkt_nd_connector_reject pkt; - int ret; - - if (priv_data_length > MAX_PRIVATE_DATA_LEN) { - hvnd_error("priv_data_length=%d\n", priv_data_length); - return -EINVAL; - } - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_REJECT, 0, 0, 0); - - /* - * Now fill in the ioctl section. 
- */ - pkt.ioctl.in.hdr.version = ND_VERSION_1; - pkt.ioctl.in.hdr.reserved = 0; - pkt.ioctl.in.hdr.cb_private_data_length = priv_data_length; - - pkt.ioctl.in.hdr.cb_private_data_offset = - offsetof(struct connector_reject_in, private_data); - - pkt.ioctl.in.hdr.connector_handle = connector_handle; - - memcpy(pkt.ioctl.in.private_data, priv_data, priv_data_length); - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - *qp_state = pkt.ioctl.out.state; - return 0; - -err: - return ret; -} - -int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, - struct nd_read_limits *rd_limits) -{ - struct pkt_nd_connector_get_rd_limits pkt; - int ret; - - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_GET_READ_LIMITS, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.in.version = ND_VERSION_1; - pkt.ioctl.in.in.reserved = 0; - pkt.ioctl.in.in.handle = connector_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - *rd_limits = pkt.ioctl.out.out; - return 0; - -err: - return ret; -} - -int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, - u8 *priv_data) -{ - struct pkt_nd_connector_get_priv_data pkt; - int ret; - - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = connector_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - memcpy(priv_data, pkt.ioctl.out, MAX_PRIVATE_DATA_LEN); - return 0; - -err: - return ret; -} - -int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, - union nd_sockaddr_inet *peer_addr) -{ - struct pkt_nd_connector_get_peer_addr pkt; - int ret; - - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS, 0, 0, 0); - - /* - * Now fill in the ioctl section. 
- */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = connector_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - memcpy(peer_addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); - return 0; - -err: - return ret; -} - -int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, - union nd_sockaddr_inet *addr) -{ - struct pkt_nd_connector_get_addr pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_GET_ADDRESS, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = connector_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - memcpy(addr, &pkt.ioctl.out, sizeof(union nd_sockaddr_inet)); - return 0; - -err: - return ret; -} - - -int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, struct hvnd_ep_obj *ep) -{ - struct pkt_nd_connector_notify_disconnect pkt; - int ret; - union ndv_context_handle irp_fhandle; - - irp_fhandle .local = ep->local_irp; - - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT, 0, 0, - irp_fhandle.val64); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = connector_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - return ret; -} - - -//ASYNCH call -int hvnd_connector_disconnect(struct hvnd_dev *nd_dev, - struct hvnd_ucontext *uctx, - u64 connector_handle, struct hvnd_ep_obj *ep) -{ - struct pkt_nd_connector_disconnect pkt; - int ret; - union ndv_context_handle irp_fhandle; - - irp_fhandle.local = ep->local_irp; - - - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_CONNECTOR_DISCONNECT, 0, 0, irp_fhandle.val64); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = connector_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - return ret; -} - -/* - * QP operations. - */ -int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - struct hvnd_qp *qp) -{ - int ret; - struct pkt_nd_create_qp *pkt; - int num_pfn, num_db_pfn; - int qp_pkt_size; - unsigned int offset; - u32 ext_data_offset; - u32 ext_data_size; - - /* - * Now create QP. - * First compute the number of PFNs we need to accomodate: - * One each for door bell and arm_sn and pages in cq buffer. 
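*
* (The line above was carried over from the CQ path; a QP has no
* arm_sn. Only the work-queue buffer and the 4-byte doorbell get
* mapped, which is why the extended data below has two fields where
* the CQ's had three, and why the doorbell range here is 4 bytes
* against the CQ's 8, i.e. two 32-bit counters.)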
- */ - offset = offset_in_page(qp->qp_buf); - num_pfn = DIV_ROUND_UP(offset + qp->buf_size, PAGE_SIZE); - - offset = offset_in_page(qp->db_addr); - num_db_pfn = DIV_ROUND_UP(offset + 4, PAGE_SIZE); - - qp_pkt_size = sizeof(struct pkt_nd_create_qp) + - (num_pfn * sizeof(u64)); - - hvnd_debug("CREATE QP, num pfns is %d\n", num_pfn); - hvnd_debug("CREATE QP, num DB pfns is %d\n", num_db_pfn); - - pkt = kzalloc(qp_pkt_size, GFP_KERNEL); - - if (!pkt) - return -ENOMEM; - - hvnd_debug("offset of nd_create_qp is %d\n", - (int)offsetof(struct pkt_nd_create_qp, ioctl.input)); - - ext_data_offset = offsetof(struct pkt_nd_create_qp, ext_data) - - sizeof(struct ndv_packet_hdr_control_1); - - ext_data_size = sizeof(struct create_qp_ext_data) + - (num_pfn * sizeof(u64)); - - hvnd_init_hdr(&pkt->hdr, - qp_pkt_size - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_QP_CREATE, - ext_data_size, - ext_data_offset, - 0); - - /* - * Now fill out the ioctl section. - */ - - pkt->ioctl.input.hdr.version = ND_VERSION_1; - - if (qp->max_inline_data > nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold) - qp->max_inline_data = nd_dev->query_pkt.ioctl.ad_info.inline_request_threshold; - pkt->ioctl.input.hdr.cb_max_inline_data = qp->max_inline_data; - - hvnd_debug("pkt->ioctl.input.hdr.cb_max_inline_data=%d\n", - pkt->ioctl.input.hdr.cb_max_inline_data); - - pkt->ioctl.input.hdr.ce_mapping_cnt = MLX4_IB_CREATE_QP_MAPPINGS_MAX; - pkt->ioctl.input.hdr.cb_mapping_offset = sizeof(union create_qp_ioctl); - - pkt->ioctl.input.hdr.initiator_queue_depth = qp->initiator_q_depth; - pkt->ioctl.input.hdr.max_initiator_request_sge = - qp->initiator_request_sge; - - hvnd_debug("recv cq handle is %p\n", (void *)qp->receive_cq_handle); - hvnd_debug("send cq handle is %p\n", (void *)qp->initiator_cq_handle); - hvnd_debug("pd handle is %p\n", (void *)qp->pd_handle); - pkt->ioctl.input.hdr.receive_cq_handle = qp->receive_cq_handle; - pkt->ioctl.input.hdr.initiator_cq_handle = qp->initiator_cq_handle; - pkt->ioctl.input.hdr.pd_handle = qp->pd_handle; - - - hvnd_debug("ce_mapping cnt is %d\n", - pkt->ioctl.input.hdr.ce_mapping_cnt); - hvnd_debug("cb_mapping offset is %d\n", - pkt->ioctl.input.hdr.cb_mapping_offset); - - pkt->ioctl.input.receive_queue_depth = qp->receive_q_depth; - pkt->ioctl.input.max_receive_request_sge = qp->receive_request_sge; - - - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.map_type = ND_MAP_MEMORY; - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.access_type = ND_MODIFY_ACCESS; - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.address = (u64)qp->qp_buf; - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_BUF].map_memory.cb_length = qp->buf_size; - - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.map_type = ND_MAP_MEMORY_COALLESCE; - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.access_type = ND_WRITE_ACCESS; - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.address = (u64)qp->db_addr; - pkt->mappings.qp_in.mappings[MLX4_IB_CREATE_QP_DB].map_memory.cb_length = 4; - - pkt->mappings.qp_in.log_sq_bb_count = qp->log_sq_bb_count; - pkt->mappings.qp_in.log_sq_stride = qp->log_sq_stride; - pkt->mappings.qp_in.sq_no_prefetch = qp->sq_no_prefetch; - - - /* - * Fill in the extended data. 
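The page-count computation above folds the buffer's leading in-page offset into the length before rounding up. A tiny restatement of that arithmetic (helper name hypothetical):

static int hvnd_buf_pfn_count(const void *buf, size_t len)
{
	/* bytes that precede the buffer in its first page */
	unsigned int offset = offset_in_page(buf);

	/* pages (hence PFNs) touched by [buf, buf + len) */
	return DIV_ROUND_UP(offset + len, PAGE_SIZE);
}

For example, a 4096-byte QP buffer starting 512 bytes into a page touches two pages, so two PFNs go into the gpa_range sent to the host.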
- */ - - pkt->ext_data.cnt = 2; - pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].size = - sizeof(struct gpa_range) + (num_pfn * sizeof(u64)); - pkt->ext_data.fields[MLX4_IB_CREATE_QP_BUF].offset = - offsetof(struct create_qp_ext_data, qpbuf_gpa); - - pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].size = - sizeof(struct qp_db_gpa); - pkt->ext_data.fields[MLX4_IB_CREATE_QP_DB].offset = - offsetof(struct create_qp_ext_data, db_gpa); - - /* - * Fill up the gpa range for qp buffer. - */ - - pkt->ext_data.db_gpa.byte_count = 4; // KYS 8 or 16? - pkt->ext_data.db_gpa.byte_offset = offset_in_page(qp->db_addr); - user_va_init_pfn(&pkt->ext_data.db_gpa.pfn_array[0], qp->db_umem); - - pkt->ext_data.qpbuf_gpa.byte_count = qp->buf_size; - pkt->ext_data.qpbuf_gpa.byte_offset = offset_in_page(qp->qp_buf); - user_va_init_pfn(&pkt->ext_data.qpbuf_gpa.pfn_array[0], qp->umem); - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt->hdr, qp_pkt_size, (u64)pkt); - - if (ret) - goto cr_qp_err; - - /* - * Copy the necessary response from the host. - */ - qp->qp_handle = pkt->ioctl.resrc_desc.handle; - - qp->qpn = pkt->mappings.qp_resp.qpn; - qp->max_send_wr = pkt->mappings.qp_resp.max_send_wr; - qp->max_recv_wr = pkt->mappings.qp_resp.max_recv_wr; - qp->max_send_sge = pkt->mappings.qp_resp.max_send_sge; - qp->max_recv_sge = pkt->mappings.qp_resp.max_recv_sge; - - - hvnd_debug("qp->max_send_wr=%d max_recv_wr=%d max_send_sge=%d max_recv_sge=%d max_inline_data=%d\n", - qp->max_send_wr, qp->max_recv_wr, qp->max_send_sge, - qp->max_recv_sge, qp->max_inline_data); - - ret = insert_handle(nd_dev, &nd_dev->qpidr, qp, qp->qpn); - - if (ret) - goto cr_qp_err; - - hvnd_debug("QP create after success qpn:%d qp:%p handle:%llu\n", - qp->qpn, qp, qp->qp_handle); - -cr_qp_err: - kfree(pkt); - return ret; -} - -int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - struct hvnd_qp *qp) -{ - int ret; - - ret = hvnd_free_handle(nd_dev, uctx, qp->qp_handle, IOCTL_ND_QP_FREE); - if (ret == 0) - remove_handle(nd_dev, &nd_dev->qpidr, qp->qpn); - return ret; -} - -int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - struct hvnd_qp *qp) -{ - struct pkt_nd_flush_qp pkt; - int ret; - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - IOCTL_ND_QP_FLUSH, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = qp->qp_handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - return ret; -} - - -int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, - char *mac_addr) -{ - int ret; - int pkt_type = NDV_PKT_ID1_BIND; - - /* - * Send the bind information over to the host. - * For now, we will have a single ip and MAC address that we - * will deal with. Down the road we will need to expand support - * for multiple IP and MAC addresses and also deal with changing - * IP addresses. 
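Note the qpn bookkeeping above: hvnd_create_qp() publishes the new QP in nd_dev->qpidr under its qpn, and hvnd_free_qp() removes it again. Assuming qpidr is a struct idr (vmbus_rdma.h pulls in linux/idr.h), the reverse lookup used when the host reports an event against a qpn is essentially:

/* Sketch only; locking around the idr is elided. */
static struct hvnd_qp *hvnd_qpn_to_qp(struct hvnd_dev *nd_dev, u32 qpn)
{
	return idr_find(&nd_dev->qpidr, qpn);
}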
- */ - - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - hvnd_debug("bind packet type is %d ID:%d\n", - pkt_type, NDV_PACKET_TYPE_ID(pkt_type)); - nd_dev->bind_pkt.pkt_hdr.packet_type = pkt_type; - - nd_dev->bind_pkt.pkt_hdr.hdr_sz = sizeof(struct ndv_pkt_hdr_bind_1); - hvnd_debug("bind packet size is %d\n", - (int)sizeof(struct ndv_pkt_hdr_bind_1)); - nd_dev->bind_pkt.pkt_hdr.data_sz = 0; - nd_dev->bind_pkt.unbind = un_bind; - nd_dev->bind_pkt.ip_address.address_family = AF_INET; - nd_dev->bind_pkt.ip_address.ipv4.sin_family = AF_INET; - nd_dev->bind_pkt.ip_address.ipv4.sin_port = 0; - nd_dev->bind_pkt.ip_address.ipv4.sin_addr.s_addr = *(unsigned int *)ip_addr; - - nd_dev->bind_pkt.phys_addr.length = ETH_ALEN; - memcpy(nd_dev->bind_pkt.phys_addr.addr, mac_addr, ETH_ALEN); - - /* - * This is the adapter handle; needs to be unique for each - * MAC, ip address tuple. - */ - nd_dev->bind_pkt.guest_id = (u64)nd_dev; - - ret = hvnd_send_packet(nd_dev, &nd_dev->bind_pkt, - sizeof(struct ndv_pkt_hdr_bind_1), - (u64)NULL, - true); - return ret; -} - -int hvnd_init_resources(struct hvnd_dev *nd_dev) -{ - unsigned long mmio_sz; - struct resource *resrc; - int ret = -ENOMEM; - - resrc = &iomem_resource; - - mmio_sz = nd_dev->hvdev->channel->offermsg.offer.mmio_megabytes * - 1024 * 1024; - nd_dev->mmio_sz = mmio_sz; - nd_dev->mmio_resource.name = KBUILD_MODNAME; - nd_dev->mmio_resource.flags = IORESOURCE_MEM | IORESOURCE_BUSY; - - ret = allocate_resource(resrc, &nd_dev->mmio_resource, - mmio_sz, 0, -1, mmio_sz, NULL, NULL); - - if (ret) { - hvnd_error("Unable to allocate mmio resources\n"); - return ret; - } - hvnd_debug("MMIO start is %p\n", (void *)nd_dev->mmio_resource.start); - - /* - * Send the mmio information over to the host. - */ - nd_dev->resources.pkt_hdr.packet_type = NDV_PKT_ID1_INIT_RESOURCES; - nd_dev->resources.pkt_hdr.hdr_sz = sizeof(union ndv_packet_hdr); - nd_dev->resources.pkt_hdr.data_sz = 0; - - nd_dev->resources.io_space_sz_mb = mmio_sz; - nd_dev->resources.io_space_start = nd_dev->mmio_resource.start; - - ret = hvnd_send_packet(nd_dev, &nd_dev->resources, - sizeof(struct ndv_pkt_hdr_init_resources_1), - (u64)NULL, - true); - return ret; -} - -int hvnd_query_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx) -{ - struct pkt_nd_query_adaptor *pkt; - int ret; - int pkt_type; - struct ib_device_attr *props = &nd_dev->ibdev.attrs; - struct adapter_info_v2 *adap_info = &nd_dev->query_pkt.ioctl.ad_info; - - hvnd_debug("Performing Adapter query nd_dev=%p\n", nd_dev); - - // check if there is a need to do query - if (nd_dev->query_pkt_set) - return 0; - - // need a lock, multiple process can call this at the same time - - down(&nd_dev->query_pkt_sem); - if (nd_dev->query_pkt_set) { - up(&nd_dev->query_pkt_sem); - return 0; - } - - /* - * Now query the adaptor. 
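The query_pkt_set dance that begins above and concludes below is a double-checked initialization, so concurrent callers issue the adapter query at most once. In outline, purely as a restatement of the surrounding code:

if (nd_dev->query_pkt_set)		/* fast path: already cached */
	return 0;
down(&nd_dev->query_pkt_sem);
if (nd_dev->query_pkt_set) {		/* lost the race: another caller filled it */
	up(&nd_dev->query_pkt_sem);
	return 0;
}
/* ... send IOCTL_ND_ADAPTER_QUERY, set query_pkt_set, then up() ... */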
- */ - - pkt = &nd_dev->query_pkt; - - pkt_type = NDV_PKT_ID1_CONTROL; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - - pkt->hdr.pkt_hdr.packet_type = pkt_type; - pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); - pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_query_adaptor) - - sizeof(struct ndv_packet_hdr_control_1); - - - pkt->hdr.file_handle.local = uctx->file_handle.local; - pkt->hdr.file_handle.remote = uctx->file_handle.remote; - - pkt->hdr.irp_handle.val64 = 0; - - pkt->hdr.io_cntrl_code = IOCTL_ND_ADAPTER_QUERY; - pkt->hdr.output_buf_sz = sizeof(struct nd_adap_query_ioctl); - pkt->hdr.input_buf_sz = sizeof(struct nd_adap_query_ioctl); - pkt->hdr.input_output_buf_offset = 0; - memset(&pkt->ioctl.ad_q, 0, sizeof(struct nd_adap_query_ioctl)); - - pkt->ioctl.ad_q.version = ND_VERSION_1; - pkt->ioctl.ad_q.info_version = ND_VERSION_2; - pkt->ioctl.ad_q.adapter_handle = uctx->adaptor_hdl; - - ret = hvnd_send_packet(nd_dev, pkt, - sizeof(struct pkt_nd_query_adaptor), - (unsigned long)pkt, true); - - hvnd_debug("pkt->ioctl.ad_info.inline_request_threshold=%d\n", - pkt->ioctl.ad_info.inline_request_threshold); - nd_dev->query_pkt_set = true; - - // how about host returning PENDING - up(&nd_dev->query_pkt_sem); - - if (ret) - return ret; - - hvnd_debug("Query Adaptor Succeeded\n"); - - /* - * Cache the relevant properties out. - */ - props->fw_ver = 0; - props->device_cap_flags = 0; - props->vendor_id = 0x15b3; - props->vendor_part_id = adap_info->device_id; - props->max_mr_size = ~0ull; - props->page_size_cap = PAGE_SIZE; - props->max_qp = 16384; - props->max_qp_wr = min(adap_info->max_recv_q_depth, - adap_info->max_initiator_q_depth); - props->max_sge = min(adap_info->max_initiator_sge, - adap_info->max_recv_sge); - props->max_cq = 0x1FFFF; - props->max_cqe = adap_info->max_completion_q_depth; - props->max_mr = 16384; - props->max_pd = 16384; - props->max_qp_rd_atom = adap_info->max_inbound_read_limit; - props->max_qp_init_rd_atom = adap_info->max_outbound_read_limit; - props->max_res_rd_atom = props->max_qp_rd_atom * props->max_qp; - props->max_srq = 16384; - props->max_srq_wr = adap_info->max_recv_q_depth; - props->max_srq_sge = adap_info->max_recv_sge; - return 0; -} - - -int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev, - struct hvnd_ib_pd *hvnd_pd) -{ - struct pkt_nd_pd_create *pkt = &uctx->pd_cr_pkt; - int ret; - int pkt_type; - - hvnd_debug("Create Protection Domain\n"); - - pkt_type = NDV_PKT_ID1_CONTROL; - NDV_ADD_PACKET_OPTION(pkt_type, NDV_PACKET_OPTIONS_REQUIRES_PASSIVE); - - pkt->hdr.pkt_hdr.packet_type = pkt_type; - pkt->hdr.pkt_hdr.hdr_sz = sizeof(struct ndv_packet_hdr_control_1); - pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_pd_create) - - sizeof(struct ndv_packet_hdr_control_1); - - hvnd_debug("pdcreate packet size: %d\n", - (int)sizeof(struct pkt_nd_pd_create)); - hvnd_debug("pdcreate hdr size: %d\n", - (int)sizeof(struct ndv_packet_hdr_control_1)); - hvnd_debug("pdcreate data size: %d\n", pkt->hdr.pkt_hdr.data_sz); - - pkt->hdr.file_handle.local = uctx->create_pkt.handle.local; - pkt->hdr.file_handle.remote = uctx->create_pkt.handle.remote; - - hvnd_debug("create pd uctx is %p\n", uctx); - hvnd_debug("create pd local file is %d\n", - uctx->create_pkt.handle.local); - hvnd_debug("create pd local file is %d\n", - uctx->create_pkt.handle.remote); - - pkt->hdr.irp_handle.val64 = 0; - pkt->hdr.io_cntrl_code = IOCTL_ND_PD_CREATE; - - pkt->hdr.output_buf_sz = sizeof(struct nd_create_pd_ioctl); - 
pkt->hdr.input_buf_sz = sizeof(struct nd_create_pd_ioctl); - pkt->hdr.input_output_buf_offset = 0; - - hvnd_debug("output/input buf size: %d\n", pkt->hdr.output_buf_sz); - - /* - * Fill the ioctl section. - */ - pkt->ioctl.in.version = ND_VERSION_1; - pkt->ioctl.in.reserved = 0; - pkt->ioctl.in.handle = uctx->adaptor_hdl; - - - ret = hvnd_send_packet(nd_dev, pkt, - sizeof(struct pkt_nd_pd_create), - (unsigned long)pkt, true); - - if (ret) - return ret; - - if (pkt->hdr.pkt_hdr.status != 0) { - hvnd_error("Create PD failed; status is %d\n", - pkt->hdr.pkt_hdr.status); - return -EINVAL; - } - if (pkt->hdr.io_status != 0) { - hvnd_error("Create PD failed;io status is %d\n", - pkt->hdr.io_status); - return -EINVAL; - } - - hvnd_debug("Create PD Succeeded\n"); - - hvnd_debug("pd_handle is %p\n", (void *)pkt->ioctl.resp.pd_handle); - hvnd_debug("pdn is %d\n", (int)pkt->ioctl.resp.pdn); - - hvnd_pd->pdn = pkt->ioctl.resp.pdn; - hvnd_pd->handle = pkt->ioctl.out_handle; - - return 0; -} - -int hvnd_cancel_io(struct hvnd_ep_obj *ep_object) -{ - struct pkt_nd_cancel_io pkt; - int ret; - u32 ioctl; - - switch (ep_object->type) { - case ND_LISTENER: - hvnd_debug("LISTENER I/O Cancelled\n"); - ioctl = IOCTL_ND_LISTENER_CANCEL_IO; - break; - case ND_CONNECTOR: - hvnd_debug("CONNECTOR I/O Cancelled\n"); - ioctl = IOCTL_ND_CONNECTOR_CANCEL_IO; - break; - case ND_MR: - hvnd_debug("MR I/O Cancelled\n"); - ioctl = IOCTL_ND_MR_CANCEL_IO; - break; - case ND_CQ: - hvnd_debug("CQ I/O Cancelled\n"); - ioctl = IOCTL_ND_CQ_CANCEL_IO; - break; - default: - hvnd_error("UNKNOWN object type\n"); - return -EINVAL; - } - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - ep_object->uctx->create_pkt.handle.local, - ep_object->uctx->create_pkt.handle.remote, - ioctl, 0, 0, 0); - - /* - * Now fill in the ioctl section. - */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = ep_object->ep_handle; - hvnd_debug("cancel io handle is %p\n", (void *)ep_object->ep_handle); - - ret = hvnd_send_ioctl_pkt(ep_object->nd_dev, &pkt.hdr, - sizeof(pkt), - (u64)&pkt); - - if (ret) - goto err; - - /* - * Now that we have cancelled all I/Os, - */ - - return 0; - -err: - hvnd_error("cancel I/O operation failed\n"); - return ret; -} - - -int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, - u64 handle, u32 ioctl) -{ - struct pkt_nd_free_handle pkt; - int ret; - - hvnd_debug("Freeing handle ioctl is %s; handle is %p\n", - hvnd_get_op_name(ioctl), (void *)handle); - - hvnd_debug("uctx is %p\n", uctx); - hvnd_debug("nd_dev is %p\n", nd_dev); - - // KYS try to avoid having to zero everything - memset(&pkt, 0, sizeof(pkt)); - hvnd_init_hdr(&pkt.hdr, - sizeof(pkt) - - sizeof(struct ndv_packet_hdr_control_1), - uctx->create_pkt.handle.local, - uctx->create_pkt.handle.remote, - ioctl, 0, 0, 0); - - /* - * Now fill in the ioctl section. 
- */ - pkt.ioctl.in.version = ND_VERSION_1; - pkt.ioctl.in.reserved = 0; - pkt.ioctl.in.handle = handle; - - ret = hvnd_send_ioctl_pkt(nd_dev, &pkt.hdr, sizeof(pkt), (u64)&pkt); - - if (ret) - goto err; - - return 0; - -err: - hvnd_error("%s: ret=%d\n", __func__, ret); - return ret; -} - -int hvnd_negotiate_version(struct hvnd_dev *nd_dev) -{ - union ndv_packet_init *pkt = &nd_dev->init_pkt; - int ret; - - nd_dev->negotiated_version = NDV_PROTOCOL_VERSION_INVALID; - - pkt->packet_type = NDV_PACKET_TYPE_INIT; - pkt->protocol_version = NDV_PROTOCOL_VERSION_CURRENT; - pkt->flags = 0; // KYS are the flags 0? - - ret = hvnd_send_packet(nd_dev, pkt, - sizeof(union ndv_packet_init), (u64)NULL, true); - - return ret; -} - -void hvnd_callback(void *context) -{ - struct hv_device *dev = context; - struct hvnd_dev *nd_dev = hv_get_drvdata(dev); - int copy_sz = 0; - struct ndv_packet_hdr_control_1 *ctrl_hdr; - union ndv_packet_init *pkt_init; - u32 recvlen; - u32 local_irp; - u64 requestid; - u32 *pkt_type; - u32 pkt_id; - struct hvnd_ep_obj *ep_object; - struct incoming_pkt *incoming_pkt; /* Used only for asynch calls */ - char *incoming_pkt_start; - struct vmpacket_descriptor *desc; - int status; - struct hvnd_cookie *hvnd_cookie; - unsigned long flags; - - vmbus_recvpacket_raw(dev->channel, hvnd_recv_buffer, - (PAGE_SIZE * 4), &recvlen, &requestid); - - if (recvlen <= 0) - return; - - desc = (struct vmpacket_descriptor *)hvnd_recv_buffer; - incoming_pkt_start = hvnd_recv_buffer + (desc->offset8 << 3); - recvlen -= desc->offset8 << 3; - - pkt_type = (u32 *)incoming_pkt_start; - pkt_id = *pkt_type; - if (pkt_id != NDV_PACKET_TYPE_INIT) - pkt_id = NDV_PACKET_TYPE_ID(pkt_id); - - switch (pkt_id) { - case NDV_PACKET_TYPE_INIT: - /* - * Host is responding to our init packet. - */ - pkt_init = (union ndv_packet_init *)incoming_pkt_start; - nd_dev->negotiated_version = pkt_init->protocol_version; - copy_sz = 0; - break; - - case NDV_PKT_ID1_INIT_RESOURCES: - copy_sz = 0; - break; - - case NDV_PKT_ID1_BIND: - nd_dev->bind_pkt.pkt_hdr.status = - ((union ndv_packet_hdr *) incoming_pkt_start)->status; - copy_sz = 0; - break; - - case NDV_PKT_ID1_COMPLETE: - ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; - status = ctrl_hdr->io_status; - - local_irp = ctrl_hdr->irp_handle.local; - ep_object = (struct hvnd_ep_obj *)map_irp_to_ctx(nd_dev, - local_irp); - - if (!ep_object) { - hvnd_error("irp could not be mapped; irp is %d ioctl is %s", - local_irp, - hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); - goto complete; - } - - if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) - hvnd_debug("completion packet; iostatus is %x, ioctl is %s", - ctrl_hdr->io_status, - hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); - - switch (ctrl_hdr->io_cntrl_code) { - - case IOCTL_ND_CQ_NOTIFY: - hvnd_process_cq_event_complete(ep_object, status); - - ep_del_work_pending(ep_object); - goto complete; - - case IOCTL_ND_CONNECTOR_ACCEPT: - - hvnd_process_connector_accept(ep_object, status); - - ep_del_work_pending(ep_object); - goto complete; - - case IOCTL_ND_CONNECTOR_DISCONNECT: - hvnd_debug("disconnected: ep obj is %p; status: %d\n", - ep_object, status); - hvnd_process_disconnect(ep_object, status); - - ep_del_work_pending(ep_object); - goto complete; - - default: - break; - } - - /* - * This is the completion notification; - * the IRP cookie is the state through which - * we will invoke the callback.
- */ - incoming_pkt = kmalloc(recvlen + sizeof(struct incoming_pkt), - GFP_ATOMIC); - if (incoming_pkt == NULL) { - hvnd_error("Could not alloc memory in callback\n"); - ep_del_work_pending(ep_object); - goto complete; - } - memcpy(incoming_pkt->pkt, incoming_pkt_start, recvlen); - - spin_lock_irqsave(&ep_object->incoming_pkt_list_lock, flags); - list_add_tail(&incoming_pkt->list_entry, - &ep_object->incoming_pkt_list); - spin_unlock_irqrestore(&ep_object->incoming_pkt_list_lock, - flags); - - schedule_work(&ep_object->wrk.work); - - goto complete; - - case NDV_PKT_ID1_CREATE: - copy_sz = sizeof(struct ndv_pkt_hdr_create_1); - break; - - case NDV_PKT_ID1_CLEANUP: - copy_sz = sizeof(struct ndv_pkt_hdr_cleanup_1); - break; - - case NDV_PKT_ID1_CONTROL: - ctrl_hdr = (struct ndv_packet_hdr_control_1 *)incoming_pkt_start; - status = ctrl_hdr->io_status; - - if (ctrl_hdr->io_cntrl_code != IOCTL_ND_CQ_NOTIFY) - hvnd_debug("packet; iostatus is %x ioctl is %s", - ctrl_hdr->io_status, hvnd_get_op_name(ctrl_hdr->io_cntrl_code)); - - switch (ctrl_hdr->io_cntrl_code) { - - case IOCTL_ND_PROVIDER_INIT: - copy_sz = sizeof(struct pkt_nd_provider_ioctl); - break; - - case IOCTL_ND_PROVIDER_BIND_FILE: - copy_sz = sizeof(struct pkt_nd_provider_ioctl); - break; - - case IOCTL_ND_ADAPTER_OPEN: - copy_sz = sizeof(struct pkt_nd_open_adapter); - break; - - case IOCTL_ND_ADAPTER_CLOSE: - copy_sz = sizeof(struct pkt_nd_free_handle); - break; - - case IOCTL_ND_ADAPTER_QUERY: - copy_sz = sizeof(struct pkt_nd_query_adaptor); - break; - - case IOCTL_ND_PD_CREATE: - copy_sz = sizeof(struct pkt_nd_pd_create); - break; - - case IOCTL_ND_PD_FREE: - copy_sz = sizeof(struct pkt_nd_free_handle); - break; - - case IOCTL_ND_CQ_CREATE: - copy_sz = sizeof(struct pkt_nd_create_cq); - break; - - case IOCTL_ND_CQ_FREE: - copy_sz = sizeof(struct pkt_nd_free_cq); - break; - - case IOCTL_ND_CQ_NOTIFY: //FIXME check ep stop state - local_irp = ctrl_hdr->irp_handle.local; - ep_object = (struct hvnd_ep_obj *) - map_irp_to_ctx(nd_dev, local_irp); - if (!ep_object) { - hvnd_error("irp could not be mapped\n"); - goto complete; - return; - } - copy_sz = sizeof(struct pkt_nd_notify_cq); - hvnd_process_cq_event_pending(ep_object, status); - goto complete; - return; - - case IOCTL_ND_LISTENER_CREATE: - copy_sz = sizeof(struct pkt_nd_cr_listener); - break; - - case IOCTL_ND_LISTENER_FREE: - copy_sz = sizeof(struct pkt_nd_free_listener); - break; - - case IOCTL_ND_QP_FREE: - copy_sz = sizeof(struct pkt_nd_free_handle); - break; - - case IOCTL_ND_CONNECTOR_CANCEL_IO: - case IOCTL_ND_MR_CANCEL_IO: - case IOCTL_ND_CQ_CANCEL_IO: - case IOCTL_ND_LISTENER_CANCEL_IO: - copy_sz = sizeof(struct pkt_nd_cancel_io); - break; - - case IOCTL_ND_LISTENER_BIND: - copy_sz = sizeof(struct pkt_nd_bind_listener); - break; - - case IOCTL_ND_LISTENER_LISTEN: - copy_sz = sizeof(struct pkt_nd_listen_listener); - break; - - case IOCTL_ND_LISTENER_GET_ADDRESS: - copy_sz = sizeof(struct pkt_nd_get_addr_listener); - break; - - case IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST: - copy_sz = sizeof(struct pkt_nd_get_connection_listener); - goto complete; // non-block - - case IOCTL_ND_CONNECTOR_CREATE: - copy_sz = sizeof(struct pkt_nd_cr_connector); - break; - - case IOCTL_ND_CONNECTOR_FREE: - copy_sz = sizeof(struct pkt_nd_free_connector); - break; - - case IOCTL_ND_CONNECTOR_BIND: - copy_sz = sizeof(struct pkt_nd_free_connector); - break; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_CONNECT: - copy_sz = sizeof(struct pkt_nd_connector_connect); - goto complete; 
//non-block - - case IOCTL_ND_CONNECTOR_COMPLETE_CONNECT: - copy_sz = sizeof(struct pkt_nd_connector_connect_complete); - break; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_ACCEPT: - copy_sz = sizeof(struct pkt_nd_connector_accept); - goto complete; //non-block - - case IOCTL_ND_CONNECTOR_REJECT: - copy_sz = sizeof(struct pkt_nd_connector_reject); - break; - - case IOCTL_ND_CONNECTOR_GET_READ_LIMITS: - copy_sz = sizeof(struct pkt_nd_connector_get_rd_limits); - break; - - case IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA: - copy_sz = sizeof(struct pkt_nd_connector_get_priv_data); - break; - - case IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS: - copy_sz = sizeof(struct pkt_nd_connector_get_peer_addr); - break; - - case IOCTL_ND_CONNECTOR_GET_ADDRESS: - copy_sz = sizeof(struct pkt_nd_connector_get_addr); - break; - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT: - copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); - goto complete; //non-block - - // KYS: ALERT: ASYNCH Operation - case IOCTL_ND_CONNECTOR_DISCONNECT: - hvnd_debug("IOCTL_ND_CONNECTOR_DISCONNECT\n"); - copy_sz = sizeof(struct pkt_nd_connector_notify_disconnect); - goto complete; // non-block - - case IOCTL_ND_QP_CREATE: - copy_sz = sizeof(struct pkt_nd_create_qp); - break; - - case IOCTL_ND_MR_CREATE: - copy_sz = sizeof(struct pkt_nd_create_mr); - break; - - case IOCTL_ND_MR_FREE: - copy_sz = sizeof(struct pkt_nd_free_handle); - break; - - case IOCTL_ND_MR_REGISTER: - copy_sz = sizeof(struct pkt_nd_register_mr); - break; - - case IOCTL_ND_MR_DEREGISTER: - copy_sz = sizeof(struct pkt_nd_deregister_mr); - break; - - case IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST: - copy_sz = sizeof(struct pkt_query_addr_list); - break; - - case IOCTL_ND_QP_FLUSH: - copy_sz = sizeof(struct pkt_nd_flush_qp); - break; - - default: - hvnd_warn("Got unknown ioctl: %d\n", - ctrl_hdr->io_cntrl_code); - copy_sz = 0; - break; - } - - break; - default: - hvnd_warn("Got an unknown packet type %d\n", *pkt_type); - break; - } - - hvnd_cookie = (struct hvnd_cookie *)requestid; - memcpy(hvnd_cookie->pkt, incoming_pkt_start, copy_sz); - complete(&hvnd_cookie->host_event); - -complete: - /* send out ioctl completion patcket */ - if (desc->flags & VMBUS_DATA_PACKET_FLAG_COMPLETION_REQUESTED) { - int retry = 5; - - while (true) { - int ret; - - ret = vmbus_sendpacket(dev->channel, NULL, 0, - requestid, VM_PKT_COMP, 0); - if (ret == 0) { - break; - } else if (ret == -EAGAIN) { - if (--retry == 0) { - hvnd_error("give up retrying send completion packet\n"); - break; - } - hvnd_warn("retrying send completion packet\n"); - udelay(100); - } else { - hvnd_error("unable to send completion packet ret=%d\n", - ret); - break; - } - } - } - -} diff --git a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.h b/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.h deleted file mode 100644 index 78e534fa8748..000000000000 --- a/drivers/infiniband/hw/vmbus-rdma/vmbus-rdma-144/vmbus_rdma.h +++ /dev/null @@ -1,2202 +0,0 @@ -/* - * Copyright (c) 2014, Microsoft Corporation. - * - * Author: - * K. Y. Srinivasan <kys@microsoft.com> - * - * This program is free software; you can redistribute it and/or modify it - * under the terms of the GNU General Public License version 2 as published - * by the Free Software Foundation. 
- * - * This program is distributed in the hope that it will be useful, but - * WITHOUT ANY WARRANTY; without even the implied warranty of - * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or - * NON INFRINGEMENT. See the GNU General Public License for more - * details. - * - * Bug fixes/enhancements: Long Li <longli@microsoft.com> - */ - - -#ifndef _VMBUS_RDMA_H -#define _VMBUS_RDMA_H - - -#include <linux/in.h> -#include <linux/in6.h> -#include <rdma/ib_verbs.h> -#include <linux/idr.h> -#include <linux/if_ether.h> - -/* NetworkDirect version Numbers. - */ -#define ND_VERSION_1 0x1 -#define ND_VERSION_2 0x20000 - -#ifndef NDVER -#define NDVER ND_VERSION_2 -#endif - -#define ND_ADAPTER_FLAG_IN_ORDER_DMA_SUPPORTED 0x00000001 -#define ND_ADAPTER_FLAG_CQ_INTERRUPT_MODERATION_SUPPORTED 0x00000004 -#define ND_ADAPTER_FLAG_MULTI_ENGINE_SUPPORTED 0x00000008 -#define ND_ADAPTER_FLAG_CQ_RESIZE_SUPPORTED 0x00000100 -#define ND_ADAPTER_FLAG_LOOPBACK_CONNECTIONS_SUPPORTED 0x00010000 - -#define ND_CQ_NOTIFY_ERRORS 0 -#define ND_CQ_NOTIFY_ANY 1 -#define ND_CQ_NOTIFY_SOLICITED 2 - -#define ND_MR_FLAG_ALLOW_LOCAL_WRITE 0x00000001 -#define ND_MR_FLAG_ALLOW_REMOTE_READ 0x00000002 -#define ND_MR_FLAG_ALLOW_REMOTE_WRITE 0x00000005 -#define ND_MR_FLAG_RDMA_READ_SINK 0x00000008 -#define ND_MR_FLAG_DO_NOT_SECURE_VM 0x80000000 - -#define ND_OP_FLAG_SILENT_SUCCESS 0x00000001 -#define ND_OP_FLAG_READ_FENCE 0x00000002 -#define ND_OP_FLAG_SEND_AND_SOLICIT_EVENT 0x00000004 -#define ND_OP_FLAG_ALLOW_READ 0x00000008 -#define ND_OP_FLAG_ALLOW_WRITE 0x00000010 - -#if NDVER >= ND_VERSION_2 -#define ND_OP_FLAG_INLINE 0x00000020 -#endif - -#define ND_AF_INET6 23 -#define IF_MAX_ADDR_LENGTH 32 - -struct group_affinity { - u64 mask; //KYS: usually 0 - u16 group; // KYS usually -1 - u16 reserved[3]; -}; - -struct if_physical_addr { - u16 length; - u8 addr[IF_MAX_ADDR_LENGTH]; -}; - -struct adapter_info_v2 { - u32 info_version; - u16 vendor_id; - u16 device_id; - u64 adapter_id; - size_t max_registration_size; - size_t max_window_size; - u32 max_initiator_sge; - u32 max_recv_sge; - u32 max_read_sge; - u32 max_transfer_length; - u32 max_inline_data_size; - u32 max_inbound_read_limit; - u32 max_outbound_read_limit; - u32 max_recv_q_depth; - u32 max_initiator_q_depth; - u32 max_shared_recv_q_depth; - u32 max_completion_q_depth; - u32 inline_request_threshold; - u32 large_request_threshold; - u32 max_caller_data; - u32 max_callee_data; - u32 adapter_flags; -} __packed; - -struct nd2_adapter_info_32 { //KYS: Check what this is - u32 info_version; - u16 vendor_id; - u16 devic_id; - u64 adapter_id; - u32 max_registration_size; - u32 max_window_size; - u32 max_initiator_sge; - u32 max_recv_sge; - u32 max_read_sge; - u32 max_transfer_length; - u32 max_inline_data_size; - u32 max_inbound_read_limit; - u32 max_outbound_read_limit; - u32 max_recv_q_depth; - u32 max_initiator_q_depth; - u32 max_shared_recv_q_depth; - u32 max_completion_q_depth; - u32 inline_request_threshold; - u32 large_request_threshold; - u32 max_caller_data; - u32 max_callee_data; - u32 adapter_flags; -} __packed; - -enum nd2_request_type { - ND2_RT_RECEIVE, - ND2_RT_SEND, - ND2_RT_BIND, - ND2_RT_INVALIDATE, - ND2_RT_READ, - ND2_RT_WRITE -}; - -struct nd2_result { - u32 status; - u32 bytes_transferred; - void *qp_ctx; - void *request_ctx; - enum nd2_request_type request_type; -} __packed; - -struct nd2_sge { - void *buffer; - u32 buffer_length; - u32 mr_token; -} __packed; - -/* - * The communication with the host via ioctls using VMBUS - * as the 
transport. - */ - -#define ND_IOCTL_VERSION 1 - -enum nd_mapping_type { - ND_MAP_IOSPACE, - ND_MAP_MEMORY, - ND_MAP_MEMORY_COALLESCE, - ND_MAP_PAGES, - ND_MAP_PAGES_COALLESCE, - ND_UNMAP_IOSPACE, - ND_UNMAP_MEMORY, - ND_MAX_MAP_TYPE -}; - -enum nd_caching_type { - ND_NON_CACHED = 0, - ND_CACHED, - ND_WRITE_COMBINED, - ND_MAX_CACHE_TYPE -}; - -enum nd_aceess_type { - ND_READ_ACCESS = 0, - ND_WRITE_ACCESS, - ND_MODIFY_ACCESS -}; - -struct nd_map_io_space { - enum nd_mapping_type map_type; - enum nd_caching_type cache_type; - u32 cb_length; -}; - -struct nd_map_memory { - enum nd_mapping_type map_type; - enum nd_aceess_type access_type; - u64 address; - u32 cb_length; -}; - -struct nd_mapping_id { - enum nd_mapping_type map_type; - u64 id; -}; - -struct ndk_map_pages { - struct nd_map_memory header; - u32 page_offset; -}; - -union nd_mapping { - enum nd_mapping_type map_type; - struct nd_map_io_space map_io_space; - struct nd_map_memory map_memory; - struct nd_mapping_id mapping_id; - struct ndk_map_pages map_pages; -}; - -struct nd_mapping_result { - u64 id; - u64 info; -}; - -struct nd_resource_descriptor { - u64 handle; - u32 ce_mapping_results; - u32 cb_mapping_results_offset; -}; - -struct nd_handle { - u32 version; - u32 reserved; - u64 handle; -}; - -union nd_sockaddr_inet { - struct sockaddr_in ipv4; - struct sockaddr_in6 ipv6; - u16 address_family; //KYS how is this supposed to work? -}; - -struct nd_address_element { - union nd_sockaddr_inet addr; - char mac_addr[ETH_ALEN]; -}; - -struct nd_resolve_address { - u32 version; - u32 reserved; - union nd_sockaddr_inet address; -}; - -struct nd_open_adapter { - u32 version; - u32 reserved; - u32 ce_mapping_cnt; - u32 cb_mapping_offset; - u64 adapter_id; -}; - -struct nd_adapter_query { - u32 version; - u32 info_version; - u64 adapter_handle; -}; - -struct nd_create_cq { - u32 version; - u32 queue_depth; - u32 ce_mapping_cnt; - u32 cb_mapping_offset; - u64 adapter_handle; - struct group_affinity affinity; -}; - -struct nd_create_srq { - u32 version; - u32 queue_depth; - u32 ce_mapping_cnt; - u32 cb_mapping_offset; - u32 max_request_sge; - u32 notify_threshold; - u64 pd_handle; - struct group_affinity affinity; -}; - -struct nd_create_qp_hdr { - u32 version; - u32 cb_max_inline_data; - u32 ce_mapping_cnt; - u32 cb_mapping_offset; //KYS: what is this prefix - ce/cb - u32 initiator_queue_depth; - u32 max_initiator_request_sge; - u64 receive_cq_handle; - u64 initiator_cq_handle; - u64 pd_handle; -}; - -struct nd_create_qp { - struct nd_create_qp_hdr hdr; - u32 receive_queue_depth; - u32 max_receive_request_sge; -}; - -struct nd_create_qp_with_srq { - struct nd_create_qp_hdr header; - u64 srq_handle; -}; - -struct nd_srq_modify { - u32 version; - u32 queue_depth; - u32 ce_mapping_cnt; - u32 cb_mapping_offset; - u32 notify_threshold; - u32 reserved; - u64 srq_handle; -}; - -struct nd_cq_modify { - u32 version; - u32 queue_depth; - u32 ce_mapping_count; - u32 cb_mappings_offset; - u64 cq_handle; -}; - -struct nd_cq_notify { - u32 version; - u32 type; - u64 cq_handle; -}; - -struct nd_mr_register_hdr { - u32 version; - u32 flags; - u64 cb_length; - u64 target_addr; - u64 mr_handle; -}; - -struct nd_mr_register { - struct nd_mr_register_hdr header; - u64 address; -}; - -struct nd_bind { - u32 version; - u32 reserved; - u64 handle; - union nd_sockaddr_inet address; -}; - -struct nd_read_limits { - u32 inbound; - u32 outbound; -}; - -struct nd_connect { - u32 version; - u32 reserved; - struct nd_read_limits read_limits; - u32 cb_private_data_length; 
- u32 cb_private_data_offset; - u64 connector_handle; - u64 qp_handle; - union nd_sockaddr_inet destination_address; - struct if_physical_addr phys_addr; -}; - -struct nd_accept { - u32 version; - u32 reserved; - struct nd_read_limits read_limits; - u32 cb_private_data_length; - u32 cb_private_data_offset; - u64 connector_handle; - u64 qp_handle; -}; - -struct nd_reject { - u32 version; - u32 reserved; - u32 cb_private_data_length; - u32 cb_private_data_offset; - u64 connector_handle; -}; - -struct nd_listen { - u32 version; - u32 back_log; - u64 listener_handle; -}; - -struct nd_get_connection_request { - u32 version; - u32 reserved; - u64 listener_handle; - u64 connector_handle; -}; - -enum ndv_mmio_type { - ND_PARTITION_KERNEL_VIRTUAL, - ND_PARTITION_SYSTEM_PHYSICAL, - ND_PARTITION_GUEST_PHYSICAL, - ND_MAXIMUM_MMIO_TYPE -}; - -struct ndv_resolve_adapter_id { - u32 version; - struct if_physical_addr phys_addr; -}; - -struct ndv_partition_create { - u32 version; - enum ndv_mmio_type mmio_type; - u64 adapter_id; - u64 xmit_cap; -}; - -struct ndv_partition_bind_luid { - u32 version; - u32 reserved; - u64 partition_handle; - struct if_physical_addr phys_addr; - //IF_LUID luid; //KYS? -}; - -struct ndv_partition_bind_address { - u32 version; - u32 reserved; - u64 partition_handle; - union nd_sockaddr_inet address; - struct if_physical_addr guest_phys_addr; - struct if_physical_addr phys_addr; -}; - -struct ndk_mr_register { - struct nd_mr_register_hdr hdr; - u32 cb_logical_page_addresses_offset; -}; - -struct ndk_bind { - struct nd_bind hdr; - u64 authentication_id; - bool is_admin; -}; - -#define FDN 0x12 -#define METHOD_BUFFERED 0x0 -#define FAA 0x0 - -#define CTL_CODE( DeviceType, Function, Method, Access ) ( \ - ((DeviceType) << 16) | ((Access) << 14) | ((Function) << 2) | (Method) \ -) - -#define ND_FUNCTION(r_, i_) ((r_) << 6 | (i_)) -#define IOCTL_ND(r_, i_) \ - CTL_CODE( FDN, ND_FUNCTION((r_), (i_)), METHOD_BUFFERED, FAA ) - -#define ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) ((ctrlCode_ >> 2) & 0xFFF) -#define ND_RESOURCE_FROM_CTL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) >> 6) -#define ND_OPERATION_FROM_CTRL_CODE(ctrlCode_) (ND_FUNCTION_FROM_CTL_CODE(ctrlCode_) & 0x3F) - -#define ND_DOS_DEVICE_NAME L"\\DosDevices\\Global\\NetworkDirect" -#define ND_WIN32_DEVICE_NAME L"\\\\.\\NetworkDirect" - -enum nd_resource_type { - ND_PROVIDER = 0, - ND_ADAPTER, - ND_PD, - ND_CQ, - ND_MR, - ND_MW, - ND_SRQ, - ND_CONNECTOR, - ND_LISTENER, - ND_QP, - ND_VIRTUAL_PARTITION, - ND_RESOURCE_TYPE_COUNT -}; - -#define ND_OPERATION_COUNT 14 - -#define IOCTL_ND_PROVIDER(i_) IOCTL_ND(ND_PROVIDER, i_) -#define IOCTL_ND_ADAPTER(i_) IOCTL_ND(ND_ADAPTER, i_) -#define IOCTL_ND_PD(i_) IOCTL_ND(ND_PD, i_) -#define IOCTL_ND_CQ(i_) IOCTL_ND(ND_CQ, i_) -#define IOCTL_ND_MR(i_) IOCTL_ND(ND_MR, i_) -#define IOCTL_ND_MW(i_) IOCTL_ND(ND_MW, i_) -#define IOCTL_ND_SRQ(i_) IOCTL_ND(ND_SRQ, i_) -#define IOCTL_ND_CONNECTOR(i_) IOCTL_ND(ND_CONNECTOR, i_) -#define IOCTL_ND_LISTENER(i_) IOCTL_ND(ND_LISTENER, i_) -#define IOCTL_ND_QP(i_) IOCTL_ND(ND_QP, i_) -#define IOCTL_ND_VIRTUAL_PARTITION(i_) IOCTL_ND(ND_VIRTUAL_PARTITION, i_) - -/* Provider IOCTLs */ -#define IOCTL_ND_PROVIDER_INIT IOCTL_ND_PROVIDER( 0 ) -#define IOCTL_ND_PROVIDER_BIND_FILE IOCTL_ND_PROVIDER( 1 ) -#define IOCTL_ND_PROVIDER_QUERY_ADDRESS_LIST IOCTL_ND_PROVIDER( 2 ) -#define IOCTL_ND_PROVIDER_RESOLVE_ADDRESS IOCTL_ND_PROVIDER( 3 ) -#define IOCTL_ND_PROVIDER_MAX_OPERATION 4 - -/* Adapter IOCTLs */ -#define IOCTL_ND_ADAPTER_OPEN IOCTL_ND_ADAPTER( 0 ) 
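For orientation, here is one of these control codes expanded by hand from the macros above (the numeric value is computed here for illustration, not quoted from the source):

/*
 * IOCTL_ND_PD_CREATE == IOCTL_ND(ND_PD, 0)
 *   function = ND_FUNCTION(ND_PD, 0) = (2 << 6) | 0 = 0x080
 *   code     = CTL_CODE(FDN, 0x080, METHOD_BUFFERED, FAA)
 *            = (0x12 << 16) | (0 << 14) | (0x080 << 2) | 0
 *            = 0x120200
 *
 * and the decode macros invert it:
 *   ND_RESOURCE_FROM_CTL_CODE(0x120200)   == ND_PD (2)
 *   ND_OPERATION_FROM_CTRL_CODE(0x120200) == 0
 */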
-#define IOCTL_ND_ADAPTER_CLOSE IOCTL_ND_ADAPTER( 1 ) -#define IOCTL_ND_ADAPTER_QUERY IOCTL_ND_ADAPTER( 2 ) -#define IOCTL_ND_ADAPTER_QUERY_ADDRESS_LIST IOCTL_ND_ADAPTER( 3 ) -#define IOCTL_ND_ADAPTER_MAX_OPERATION 4 - -/* Protection Domain IOCTLs */ -#define IOCTL_ND_PD_CREATE IOCTL_ND_PD( 0 ) -#define IOCTL_ND_PD_FREE IOCTL_ND_PD( 1 ) -#define IOCTL_ND_PD_MAX_OPERATION 2 - -/* Completion Queue IOCTLs */ -#define IOCTL_ND_CQ_CREATE IOCTL_ND_CQ( 0 ) -#define IOCTL_ND_CQ_FREE IOCTL_ND_CQ( 1 ) -#define IOCTL_ND_CQ_CANCEL_IO IOCTL_ND_CQ( 2 ) -#define IOCTL_ND_CQ_GET_AFFINITY IOCTL_ND_CQ( 3 ) -#define IOCTL_ND_CQ_MODIFY IOCTL_ND_CQ( 4 ) -#define IOCTL_ND_CQ_NOTIFY IOCTL_ND_CQ( 5 ) -#define IOCTL_ND_CQ_MAX_OPERATION 6 - -/* Memory Region IOCTLs */ -#define IOCTL_ND_MR_CREATE IOCTL_ND_MR( 0 ) -#define IOCTL_ND_MR_FREE IOCTL_ND_MR( 1 ) -#define IOCTL_ND_MR_CANCEL_IO IOCTL_ND_MR( 2 ) -#define IOCTL_ND_MR_REGISTER IOCTL_ND_MR( 3 ) -#define IOCTL_ND_MR_DEREGISTER IOCTL_ND_MR( 4 ) -#define IOCTL_NDK_MR_REGISTER IOCTL_ND_MR( 5 ) -#define IOCTL_ND_MR_MAX_OPERATION 6 - -/* Memory Window IOCTLs */ -#define IOCTL_ND_MW_CREATE IOCTL_ND_MW( 0 ) -#define IOCTL_ND_MW_FREE IOCTL_ND_MW( 1 ) -#define IOCTL_ND_MW_MAX_OPERATION 2 - -/* Shared Receive Queue IOCTLs */ -#define IOCTL_ND_SRQ_CREATE IOCTL_ND_SRQ( 0 ) -#define IOCTL_ND_SRQ_FREE IOCTL_ND_SRQ( 1 ) -#define IOCTL_ND_SRQ_CANCEL_IO IOCTL_ND_SRQ( 2 ) -#define IOCTL_ND_SRQ_GET_AFFINITY IOCTL_ND_SRQ( 3 ) -#define IOCTL_ND_SRQ_MODIFY IOCTL_ND_SRQ( 4 ) -#define IOCTL_ND_SRQ_NOTIFY IOCTL_ND_SRQ( 5 ) -#define IOCTL_ND_SRQ_MAX_OPERATION 6 - -/* Connector IOCTLs */ -#define IOCTL_ND_CONNECTOR_CREATE IOCTL_ND_CONNECTOR( 0 ) -#define IOCTL_ND_CONNECTOR_FREE IOCTL_ND_CONNECTOR( 1 ) -#define IOCTL_ND_CONNECTOR_CANCEL_IO IOCTL_ND_CONNECTOR( 2 ) -#define IOCTL_ND_CONNECTOR_BIND IOCTL_ND_CONNECTOR( 3 ) -#define IOCTL_ND_CONNECTOR_CONNECT IOCTL_ND_CONNECTOR( 4 ) -#define IOCTL_ND_CONNECTOR_COMPLETE_CONNECT IOCTL_ND_CONNECTOR( 5 ) -#define IOCTL_ND_CONNECTOR_ACCEPT IOCTL_ND_CONNECTOR( 6 ) -#define IOCTL_ND_CONNECTOR_REJECT IOCTL_ND_CONNECTOR( 7 ) -#define IOCTL_ND_CONNECTOR_GET_READ_LIMITS IOCTL_ND_CONNECTOR( 8 ) -#define IOCTL_ND_CONNECTOR_GET_PRIVATE_DATA IOCTL_ND_CONNECTOR( 9 ) -#define IOCTL_ND_CONNECTOR_GET_PEER_ADDRESS IOCTL_ND_CONNECTOR( 10 ) -#define IOCTL_ND_CONNECTOR_GET_ADDRESS IOCTL_ND_CONNECTOR( 11 ) -#define IOCTL_ND_CONNECTOR_NOTIFY_DISCONNECT IOCTL_ND_CONNECTOR( 12 ) -#define IOCTL_ND_CONNECTOR_DISCONNECT IOCTL_ND_CONNECTOR( 13 ) -#define IOCTL_ND_CONNECTOR_MAX_OPERATION 14 - -/* Listener IOCTLs */ -#define IOCTL_ND_LISTENER_CREATE IOCTL_ND_LISTENER( 0 ) -#define IOCTL_ND_LISTENER_FREE IOCTL_ND_LISTENER( 1 ) -#define IOCTL_ND_LISTENER_CANCEL_IO IOCTL_ND_LISTENER( 2 ) -#define IOCTL_ND_LISTENER_BIND IOCTL_ND_LISTENER( 3 ) -#define IOCTL_ND_LISTENER_LISTEN IOCTL_ND_LISTENER( 4 ) -#define IOCTL_ND_LISTENER_GET_ADDRESS IOCTL_ND_LISTENER( 5 ) -#define IOCTL_ND_LISTENER_GET_CONNECTION_REQUEST IOCTL_ND_LISTENER( 6 ) -#define IOCTL_ND_LISTENER_MAX_OPERATION 7 - -/* Queue Pair IOCTLs */ -#define IOCTL_ND_QP_CREATE IOCTL_ND_QP( 0 ) -#define IOCTL_ND_QP_CREATE_WITH_SRQ IOCTL_ND_QP( 1 ) -#define IOCTL_ND_QP_FREE IOCTL_ND_QP( 2 ) -#define IOCTL_ND_QP_FLUSH IOCTL_ND_QP( 3 ) -#define IOCTL_ND_QP_MAX_OPERATION 4 - -/* Kernel-mode only IOCTLs (IRP_MJ_INTERNAL_DEVICE_CONTROL) */ -#define IOCTL_NDV_PARTITION_RESOLVE_ADAPTER_ID IOCTL_ND_VIRTUAL_PARTITION( 0 ) -#define IOCTL_NDV_PARTITION_CREATE IOCTL_ND_VIRTUAL_PARTITION( 1 ) -#define IOCTL_NDV_PARTITION_FREE 
IOCTL_ND_VIRTUAL_PARTITION( 2 ) -#define IOCTL_NDV_PARTITION_BIND IOCTL_ND_VIRTUAL_PARTITION( 3 ) -#define IOCTL_NDV_PARTITION_UNBIND IOCTL_ND_VIRTUAL_PARTITION( 4 ) -#define IOCTL_NDV_PARTITION_BIND_LUID IOCTL_ND_VIRTUAL_PARTITION( 5 ) -#define IOCTL_NDV_PARTITION_MAX_OPERATION 6 - - -#define MB_SHIFT 20 - - -/* Ringbuffer size for the channel */ -#define NDV_NUM_PAGES_IN_RING_BUFFER 64 - -#define NDV_MAX_PACKETS_PER_RECEIVE 8 - -#define NDV_MAX_PACKET_COUNT 16304 - -#define NDV_MAX_NUM_OUTSTANDING_RECEIVED_PACKETS (16304) -#define NDV_MAX_HANDLE_TABLE_SIZE (16304) -#define NDV_HOST_MAX_HANDLE_TABLE_SIZE (NDV_MAX_HANDLE_TABLE_SIZE * 16) - - -#define NDV_MAX_MAPPINGS 4 - -#define NDV_STATE_NONE 0x00000000 -#define NDV_STATE_CREATED 0x00000001 -#define NDV_STATE_CONNECTING 0x00000002 -#define NDV_STATE_INITIALIZING 0x00000003 -#define NDV_STATE_OPERATIONAL 0xEFFFFFFF -#define NDV_STATE_FAILED 0xFFFFFFFF - - -#define NDV_MAX_PRIVATE_DATA_SIZE 64 -#define NDV_MAX_IOCTL_SIZE 256 - -/* max size of buffer for vector of ND_MAPPING */ -#define NDV_MAX_MAPPING_BUFFER_SIZE \ - (NDV_MAX_MAPPINGS * sizeof(union nd_mapping)) - -/* max expected ioctl buffer size from users */ -#define NDV_MAX_IOCTL_BUFFER_SIZE \ - (NDV_MAX_IOCTL_SIZE + \ - NDV_MAX_MAPPING_BUFFER_SIZE + \ - NDV_MAX_PRIVATE_DATA_SIZE) - -/* max PFN array for inline buffers */ -#define NDV_MAX_INLINE_PFN_ARRAY_LENGTH 32 - -/* Field header size for inline buffer */ -#define NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE \ - (NDV_MAX_MAPPINGS * sizeof(NDV_PACKET_FIELD)) - -/* Max for a single field */ - -#define NDV_MAX_SINGLE_MAPPING_FIELD ( sizeof(GPA_RANGE) + \ - (sizeof(PFN_NUMBER) * NDV_MAX_INLINE_PFN_ARRAY_LENGTH)) - -/* Max for all inline data */ - -#define NDV_MAX_MAPPING_DATA_SIZE (NDV_MAX_MAPPING_PACKET_FILED_BUFFER_SIZE + \ - (NDV_MAX_MAPPINGS * NDV_MAX_SINGLE_MAPPING_FIELD)) - - -#define NDV_MAX_PACKET_HEADER_SIZE 256 - -#define NDV_MAX_PACKET_SIZE (NDV_MAX_PACKET_HEADER_SIZE + \ - NDV_MAX_IOCTL_BUFFER_SIZE + \ - NDV_MAX_MAPPING_DATA_SIZE) - -/* Well known message type INIT is defined for the channel - * not for the protocol. - */ - -#define NDV_PACKET_TYPE_INIT 0xFFFFFFFF - -/* Invalid protocol version to identify uninitialized channels */ - -#define NDV_PROTOCOL_VERSION_INVALID 0xFFFFFFFF - -/* Flags that control the behavior of packet handling */ - -enum ndv_packet_options { - NDV_PACKET_OPTION_NONE = 0x00, - - /* Indicates that the ExternalDataMdl parameter is expected to be - * passed and must be handled in the receiver. This call must be - * handled specially to ensure that the MDL can be created correctly. - */ - NDV_PACKET_OPTION_EXTERNAL_DATA = 0x01, - - /* Indicates that the receiver must execute the handler at passive. */ - NDV_PACKET_OPTIONS_REQUIRES_PASSIVE = 0x02, - - /* Indicates that the sender does not expect and is not waiting for a - * response packet. - */ - NDV_PACKET_OPTIONS_POST = 0x04, -}; - -#define NDV_PACKET_TYPE(id_, opt_) \ - (((opt_)<<24) | (id_)) - -#define NDV_PACKET_TYPE_OPTIONS(type_) \ - (((type_) >> 24) & 0xFF) - -#define NDV_PACKET_TYPE_ID(type_) \ - ((type_) & 0xFFFFFF) \ - -#define NDV_ADD_PACKET_OPTION(type_, opt_) \ - (type_) |= (opt_<<24) -
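A worked example of the type packing just defined, using an ID from the ndv_pkt_id enum below (options occupy bits 31:24, the ID bits 23:0):

u32 type = NDV_PACKET_TYPE(NDV_PKT_ID1_CONTROL,
			   NDV_PACKET_OPTIONS_REQUIRES_PASSIVE);

/* NDV_PACKET_TYPE_ID(type)      == NDV_PKT_ID1_CONTROL                 */
/* NDV_PACKET_TYPE_OPTIONS(type) == NDV_PACKET_OPTIONS_REQUIRES_PASSIVE */
/* NDV_ADD_PACKET_OPTION(type, NDV_PACKET_OPTIONS_POST) ORs in one more */

This is the same sequence hvnd_bind_nic() and hvnd_query_adaptor() perform with NDV_ADD_PACKET_OPTION() before sending.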
-/* The header value sent on all packets */ -union ndv_packet_hdr { - - struct { - /* The type of packet. - * This value should be created with the NDV_PACKET_TYPE macro - * to include all packet options within the packet type. - */ - u32 packet_type; - /* The size of the entire fixed message structure that exists - * before the data. This must be >= sizeof(NDV_PACKET_HEADER) - */ - u32 hdr_sz; - /* The size of the data that follows the message - * data_sz + hdr_sz gives the total size of - * the buffer that is used. - */ - u32 data_sz; - /* The status code used to indicate success or failure. - * It is only used in completions and during responses. - */ - u32 status; //KYS: NTSTATUS? - }; - - u64 padding[2]; //KYS: why? -}; - - -/* The core INIT packet. This message is defined in the channel - * not in the protocol. This message should never change size - * or behavior, as it could impact compatibility in the future. - * This packet is used to negotiate the protocol version, so changing - * this size could break backward compat. - */ - -union ndv_packet_init { - struct { - u32 packet_type; - u32 protocol_version; - u32 flags; - }; - u64 padding[2]; -} __packed; - -#define NDV_PACKET_INIT_SIZE 16 - -/* Data packing flags used for accessing the dynamic fields inside a packet */ -#define NDV_DATA_PACKING_2 0x1 -#define NDV_DATA_PACKING_4 0x3 -#define NDV_DATA_PACKING_8 0x7 - - -#define NDV_PROTOCOL_VERSION_1 0x0100 -#define NDV_PROTOCOL_VERSION_CURRENT NDV_PROTOCOL_VERSION_1 -#define NDV_PROTOCOL_VERSION_COUNT 1 - -struct ndv_pkt_field { - u32 size; - u32 offset; -}; - -enum ndv_pkt_id { - NDV_PKT_UNKNOWN = 0, - /* Version 1 Message IDs */ - NDV_PKT_ID1_BIND, - NDV_PKT_ID1_CREATE, - NDV_PKT_ID1_CLEANUP, - NDV_PKT_ID1_CANCEL, - NDV_PKT_ID1_CONTROL, - NDV_PKT_ID1_COMPLETE, - NDV_PKT_ID1_INIT_RESOURCES, -}; - -/* The guest will send this as the first message just after init. - * The resources are reserved per channel. - */ - -struct ndv_pkt_hdr_init_resources_1 { - - union ndv_packet_hdr pkt_hdr; - u16 io_space_sz_mb; - u64 io_space_start; - -}; - - - -/* The guest will send this packet to the host after channel init - * to query support for the adapters that are registered. - */ - -struct ndv_pkt_hdr_bind_1 { - union ndv_packet_hdr pkt_hdr; - bool unbind; - union nd_sockaddr_inet ip_address; - struct if_physical_addr phys_addr; - u64 guest_id; -}; - -union ndv_context_handle { - u64 val64; - struct { - u32 local; - u32 remote; - }; -}; - -struct ndv_pkt_hdr_create_1 { - union ndv_packet_hdr pkt_hdr; - - /* Identifies the object used to track this file handle on both - * the guest and the host. When sent from the guest, it will contain - * the guest handle. On success, the host will populate and return - * its handle value as well. - */ - - union ndv_context_handle handle; - - /* The parameters sent to the CreateFile call */ - u32 access_mask; - u32 open_options; - - u16 file_attributes; //KYS: This field must be 64 bit aligned - - u16 share_access; //KYS - - u32 kys_padding; //KYS - - u16 ea_length; //KYS; needs to be 64 bit aligned; what is ea length - unused -}; - - -struct ndv_pkt_hdr_cleanup_1 { - union ndv_packet_hdr pkt_hdr; - - /* Identifies the object used to track this file handle on both - * the guest and the host. When sent from the guest, it will contain - * both the guest and host handle values. The host will use this - * value to clean up its resource, then update its portion of the handle - * to NDV_HANDLE_NULL before returning the data back to the guest. - */ - union ndv_context_handle handle; -}; - -struct ndv_pkt_hdr_cancel_1 { - union ndv_packet_hdr pkt_hdr; - union ndv_context_handle file_handle; - union ndv_context_handle irp_handle; -}; - -struct ndv_bind_port_info { - //LUID authentication_id; //KYS: LUID?
- bool is_admin; -}; - -struct ndv_extended_data_flds { - union { - u32 field_count; - u64 padding; - }; - - //struct ndv_pkt_field fields[ANYSIZE_ARRAY]; //KYS? -}; - - -struct ndv_packet_hdr_control_1 { - union ndv_packet_hdr pkt_hdr; - /* Identifies the object used to track this file handle on both - * the guest and the host. This should always have both guest - * and host handle values inside it. - */ - - union ndv_context_handle file_handle; - - /* The handle information for the allocated irp context object. - * This information is used when the host/guest starts the cancelation - */ - union ndv_context_handle irp_handle; - - /* The input data describing in the IO control parameters */ - - u32 io_cntrl_code; - u32 output_buf_sz; - u32 input_buf_sz; - u32 input_output_buf_offset; - - /* These are used in the return message to indicate the status of the IO - * operation and the amount of data written to the output buffer. - */ - u32 io_status; //KYS: NTSTATUS? - u32 bytes_returned; - - /* This contains the field information for additional data that is sent - * with the packet that is IOCTL specific. - */ - - struct ndv_pkt_field extended_data; -}; - -/* - * Include MLX specific defines. - */ - -#include "mx_abi.h" - -/* Driver specific state. - */ - -/* - * We need to have host open a file; some - * Windows constants for open. - */ -#define STANDARD_RIGHTS_ALL (0x001F0000L) -#define FILE_ATTRIBUTE_NORMAL (0x80) -#define FILE_SHARE_READ (0x00000001) -#define FILE_SHARE_WRITE (0x00000002) -#define FILE_SHARE_DELETE (0x00000004) -#define FILE_FLAG_OVERLAPPED (0x40000000) -#define FILE_SHARE_ALL (FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE) -#define CREATE_ALWAYS (2) -#define OPEN_EXISTING (3) - -#define RTL_NUMBER_OF(_x) \ - sizeof(_x)/sizeof(_x[0]) -/* - * The context structure tracks the open state. - */ - -/* - * Packet layout for open adaptor. - */ - -/* - * Packet for querying the address list. - */ - -union query_addr_list_ioctl { - struct nd_handle in; - union nd_sockaddr_inet out[16]; //KYS a max of 16 addresses -}; - -struct pkt_query_addr_list { - struct ndv_packet_hdr_control_1 hdr; - union query_addr_list_ioctl ioctl; - unsigned long activity_id; -}; - - -struct pkt_fld { - u32 size; - u32 offset; -}; - -struct fld_data { - union { - u64 padding; - }; -}; - -struct extended_data_oad { - union { - u32 cnt; - u64 padding; - }; - /* offsets are from start of extended data struct - * and should start on 8 byte boundary - */ - struct pkt_fld fields[IBV_GET_CONTEXT_MAPPING_MAX]; -}; - -union oad_ioctl { - struct nd_open_adapter input; - struct nd_resource_descriptor resrc_desc; -}; - -union oad_mappings { - struct ibv_get_context_req ctx_input; - struct ibv_get_context_resp ctx_output; -}; - -struct pkt_nd_open_adapter { - struct ndv_packet_hdr_control_1 hdr; - - union oad_ioctl ioctl; - union oad_mappings mappings; - - /* - * Extended data. - */ - struct extended_data_oad ext_data; -}; - -/* - * Create CQ IOCTL. 
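Every control packet in this header wraps ndv_packet_hdr_control_1 ahead of its ioctl/mapping payload, and the senders in vmbus_rdma.c size the two parts so they sum to the full packet (see hvnd_query_adaptor() and hvnd_create_pd() earlier). Restated for the create-CQ packet defined next, as an illustration of the existing convention rather than new protocol:

pkt->hdr.pkt_hdr.hdr_sz  = sizeof(struct ndv_packet_hdr_control_1);
pkt->hdr.pkt_hdr.data_sz = sizeof(struct pkt_nd_create_cq) -
			   sizeof(struct ndv_packet_hdr_control_1);
/* hdr_sz + data_sz == sizeof(struct pkt_nd_create_cq): bytes on the ring */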
- */
-
-struct cq_db_gpa {
-	u32 byte_count;
-	u32 byte_offset;
-	u64 pfn_array[2];
-};
-
-struct cq_sn_gpa {
-	u32 byte_count;
-	u32 byte_offset;
-	u64 pfn_array[2];
-};
-
-struct create_cq_ext_data {
-	union {
-		u32 cnt;
-		u64 padding;
-	};
-	/* offsets are from start of extended data struct
-	 * and should start on 8 byte boundary
-	 */
-	struct pkt_fld fields[MLX4_IB_CREATE_CQ_MAPPING_MAX];
-	struct cq_db_gpa db_gpa;
-	struct cq_sn_gpa sn_gpa;
-	struct gpa_range cqbuf_gpa;
-};
-
-union create_cq_ioctl {
-	struct nd_create_cq input;
-	struct nd_resource_descriptor resrc_desc;
-};
-
-union create_cq_mappings {
-	struct ibv_create_cq cq_in;
-	struct ibv_create_cq_resp cq_resp;
-};
-
-struct pkt_nd_create_cq {
-	struct ndv_packet_hdr_control_1 hdr;
-
-	union create_cq_ioctl ioctl;
-	union create_cq_mappings mappings;
-
-	/*
-	 * Extended data.
-	 */
-	struct create_cq_ext_data ext_data;
-};
-
-/*
- * IOCTL to free CQ.
- */
-struct free_cq_ioctl {
-	struct nd_handle in;
-};
-
-struct pkt_nd_free_cq {
-	struct ndv_packet_hdr_control_1 hdr;
-
-	struct free_cq_ioctl ioctl;
-};
-
-
-/*
- * IOCTL to QUERY CQ - CQ NOTIFY
- */
-
-struct notify_cq_ioctl {
-	struct nd_cq_notify in;
-};
-
-struct pkt_nd_notify_cq {
-	struct ndv_packet_hdr_control_1 hdr;
-	struct notify_cq_ioctl ioctl;
-};
-
-/*
- * IOCTL to Create a listner
- */
-
-struct nd_ep_create {
-	struct nd_handle hdr;
-	bool to_semantics;
-	unsigned long activity_id;
-};
-
-union listener_cr_ioctl {
-	struct nd_ep_create in;
-	u64 out;
-};
-
-struct pkt_nd_cr_listener {
-	struct ndv_packet_hdr_control_1 hdr;
-	union listener_cr_ioctl ioctl;
-};
-
-/*
- * IOCTL to free listener.
- */
-
-struct listener_free_ioctl {
-	struct nd_handle in;
-};
-
-struct pkt_nd_free_listener {
-	struct ndv_packet_hdr_control_1 hdr;
-	struct listener_free_ioctl ioctl;
-};
-
-/*
- * IOCTL for listener cancel IO.
- */
-struct listener_cancelio_ioctl {
-	struct nd_handle in;
-};
-
-struct pkt_nd_cancelio_listener {
-	struct ndv_packet_hdr_control_1 hdr;
-	struct listener_cancelio_ioctl ioctl;
-};
-
-/*
- * IOCTL for LISTENER BIND
- */
-
-union listener_bind_ioctl {
-	struct ndk_bind in;
-};
-
-struct pkt_nd_bind_listener {
-	struct ndv_packet_hdr_control_1 hdr;
-	union listener_bind_ioctl ioctl;
-};
-
-/*
- * After the listener is bound, enable
- * listening.
- */
-
-union listener_listen_ioctl {
-	struct nd_listen in;
-};
-
-struct pkt_nd_listen_listener {
-	struct ndv_packet_hdr_control_1 hdr;
-	union listener_listen_ioctl ioctl;
-};
-
-/*
- * IOCTL for getting the adddress from listener.
- *
- */
-
-union listener_get_addr_ioctl {
-	struct nd_handle in;
-	union nd_sockaddr_inet out;
-};
-
-struct pkt_nd_get_addr_listener {
-	struct ndv_packet_hdr_control_1 hdr;
-	union listener_get_addr_ioctl ioctl;
-};
-
-/*
- * IOCTL to get a connection from a listener.
- */
-
-union listener_get_connection_ioctl {
-	struct nd_get_connection_request in;
-	union nd_sockaddr_inet out;
-};
-
-struct pkt_nd_get_connection_listener {
-	struct ndv_packet_hdr_control_1 hdr;
-	union listener_get_connection_ioctl ioctl;
-};
-
-
-/*
- * Connector IOCTLs
- */
-
-/*
- * IOCTL to create connector.
- */
-
-union connector_cr_ioctl { //KYS should this be a union or struct?
-	struct nd_ep_create in;
-	u64 out;
-};
-
-struct pkt_nd_cr_connector {
-	struct ndv_packet_hdr_control_1 hdr;
-	union connector_cr_ioctl ioctl; //KYS: union or struct
-};
-
-/*
- * IOCTL to free connector.
- */
-
-struct connector_free_ioctl {
-	struct nd_handle in;
-};
-
-struct pkt_nd_free_connector {
-	struct ndv_packet_hdr_control_1 hdr;
-	struct connector_free_ioctl ioctl;
-};
-
-/*
- * IOCTL to cancel I/O on a connector.
- */
-
-struct connector_cancelio_ioctl {
-	struct nd_handle in;
-};
-
-struct pkt_nd_cancelio_connector {
-	struct ndv_packet_hdr_control_1 hdr;
-	struct connector_cancelio_ioctl ioctl;
-};
-
-/*
- * IOCTL to Bind an address to the connector.
- */
-
-union connector_bind_ioctl {
-	struct ndk_bind in;
-};
-
-struct pkt_nd_bind_connector {
-	struct ndv_packet_hdr_control_1 hdr;
-	union connector_bind_ioctl ioctl;
-};
-
-/*
- * IOCTL to connect a connector.
- */
-
-struct connector_connect_in {
-	struct nd_connect hdr;
-	u8 retry_cnt;
-	u8 rnr_retry_cnt;
-	u8 priv_data[56];
-	unsigned long activity_id;
-};
-
-union connector_connect_ioctl {
-	struct connector_connect_in in;
-};
-
-struct pkt_nd_connector_connect {
-	struct ndv_packet_hdr_control_1 hdr;
-	union connector_connect_ioctl ioctl;
-};
-
-/*
- * IOCTL for connector complete connect
- */
-
-struct complete_connect_in {
-	struct nd_handle hdr;
-	u8 rnr_nak_to;
-	unsigned long activity_id;
-};
-
-struct complete_connect_out {
-	enum ibv_qp_state state;
-};
-
-union connector_complete_connect_ioctl {
-	struct complete_connect_in in;
-	struct complete_connect_out out;
-};
-
-struct pkt_nd_connector_connect_complete {
-	struct ndv_packet_hdr_control_1 hdr;
-	union connector_complete_connect_ioctl ioctl;
-};
-
-
-#define MAX_PRIVATE_DATA_LEN 148
-
-/*
- * IOCTL for connector accept.
- */
-
-struct connector_accept_in {
-	struct nd_accept hdr;
-	u8 rnr_retry_cnt;
-	u8 rnr_nak_to;
-	u8 private_data[MAX_PRIVATE_DATA_LEN];
-	unsigned long activity_id;
-};
-
-struct connector_accept_out {
-	enum ibv_qp_state state;
-};
-
-union connector_accept_ioctl {
-	struct connector_accept_in in;
-	struct connector_accept_out out;
-};
-
-struct pkt_nd_connector_accept {
-	struct ndv_packet_hdr_control_1 hdr;
-	union connector_accept_ioctl ioctl;
-};
-
-/*
- * IOCTL for connector to reject a connection.
- */
-
-struct connector_reject_in {
-	struct nd_reject hdr;
-	u8 private_data[MAX_PRIVATE_DATA_LEN];
-};
-
-struct connector_reject_out {
-	enum ibv_qp_state state;
-};
-
-union connector_reject_ioctl {
-	struct connector_reject_in in;
-	struct connector_reject_out out;
-};
-
-struct pkt_nd_connector_reject {
-	struct ndv_packet_hdr_control_1 hdr;
-	union connector_reject_ioctl ioctl;
-};
-
-/*
- * IOCTL to get connector read limits.
- */
-
-struct connector_get_rd_limits_in {
-	struct nd_handle in;
-};
-
-struct connector_get_rd_limits_out {
-	struct nd_read_limits out;
-};
-
-union connector_get_rd_limits_ioctl {
-	struct connector_get_rd_limits_in in;
-	struct connector_get_rd_limits_out out;
-};
-
-struct pkt_nd_connector_get_rd_limits {
-	struct ndv_packet_hdr_control_1 hdr;
-	union connector_get_rd_limits_ioctl ioctl;
-};
-
-/*
- * IOCTL to get connector private data.
- */
-union connector_get_priv_data_ioctl {
-	struct nd_handle in;
-	u8 out[MAX_PRIVATE_DATA_LEN];
-};
-
-struct pkt_nd_connector_get_priv_data {
-	struct ndv_packet_hdr_control_1 hdr;
-	union connector_get_priv_data_ioctl ioctl;
-};
-
-
-/*
- * IOCTL get peer address.
- */
-
-union connector_get_peer_addr_ioctl {
-	struct nd_handle in;
-	union nd_sockaddr_inet out;
-};
-
-struct pkt_nd_connector_get_peer_addr {
-	struct ndv_packet_hdr_control_1 hdr;
-	union connector_get_peer_addr_ioctl ioctl;
-};
-
-/*
- * IOCTL to get connector address.
- */
-
-union connector_get_addr_ioctl {
-	struct nd_handle in;
-	union nd_sockaddr_inet out;
-};
-
-struct pkt_nd_connector_get_addr {
-	struct ndv_packet_hdr_control_1 hdr;
-	union connector_get_addr_ioctl ioctl;
-};
-
-/*
- * IOCTL for disconnect notification.
- */
-
-union connector_notify_disconnect_ioctl {
-	struct nd_handle in;
-};
-
-struct pkt_nd_connector_notify_disconnect {
-	struct ndv_packet_hdr_control_1 hdr;
-	union connector_notify_disconnect_ioctl ioctl;
-};
-
-union connector_disconnect_ioctl {
-	struct nd_handle in;
-};
-
-struct pkt_nd_connector_disconnect {
-	struct ndv_packet_hdr_control_1 hdr;
-	union connector_notify_disconnect_ioctl ioctl;
-};
-
-/*
- * IOCTLs for QP operations.
- */
-
-/*
- * Create qp IOCTL.
- */
-
-struct qp_db_gpa {
-	u32 byte_count;
-	u32 byte_offset;
-	u64 pfn_array[1];
-};
-
-struct create_qp_ext_data {
-	union {
-		u32 cnt;
-		u64 padding;
-	};
-	/* offsets are from start of extended data struct
-	 * and should start on 8 byte boundary
-	 */
-	struct pkt_fld fields[MLX4_IB_CREATE_QP_MAPPINGS_MAX];
-	struct qp_db_gpa db_gpa;
-	struct gpa_range qpbuf_gpa;
-};
-
-union create_qp_ioctl {
-	struct nd_create_qp input;
-	struct nd_resource_descriptor resrc_desc;
-};
-
-union create_qp_mappings {
-	struct ibv_create_qp qp_in;
-	struct ibv_create_qp_resp qp_resp;
-};
-
-struct pkt_nd_create_qp {
-	struct ndv_packet_hdr_control_1 hdr;
-
-	union create_qp_ioctl ioctl;
-	union create_qp_mappings mappings;
-
-	/*
-	 * Extended data.
-	 */
-	struct create_qp_ext_data ext_data;
-};
-
-/*
- * IOCTL to flush a QP.
- */
-struct flush_qp_ioctl {
-	struct nd_handle in;
-	enum ibv_qp_state out;
-};
-
-struct pkt_nd_flush_qp {
-	struct ndv_packet_hdr_control_1 hdr;
-	struct flush_qp_ioctl ioctl;
-};
-
-/*
- * Memory Region IOCTLS
- */
-union create_mr_ioctl {
-	struct nd_handle in;
-	u64 out;
-};
-
-struct pkt_nd_create_mr {
-	struct ndv_packet_hdr_control_1 hdr;
-	union create_mr_ioctl ioctl;
-};
-
-struct mr_out {
-	u32 lkey;
-	u32 rkey;
-	unsigned long activity_id;
-};
-
-
-union register_mr_ioctl {
-	struct nd_mr_register in;
-	struct mr_out out;
-};
-
-struct pkt_nd_register_mr {
-	struct ndv_packet_hdr_control_1 hdr;
-	union register_mr_ioctl ioctl;
-};
-
-struct deregister_mr_ioctl {
-	struct nd_handle in;
-};
-
-struct pkt_nd_deregister_mr {
-	struct ndv_packet_hdr_control_1 hdr;
-	struct deregister_mr_ioctl ioctl;
-};
-
-/*
- * IOCTL to disconnect connector
- */
-
-/*
- * Create PD IOCTL.
- */
-struct nd_create_pd_ioctl {
-	union {
-		struct nd_handle in;
-		u64 out_handle;
-	};
-	struct ibv_alloc_pd_resp resp;
-};
-
-struct pkt_nd_pd_create {
-	struct ndv_packet_hdr_control_1 hdr;
-	struct nd_create_pd_ioctl ioctl;
-};
-
-/*
- * Free Handle. Check the layout with Luke.
- *
- */
-struct free_handle_ioctl {
-	struct nd_handle in;
-};
-
-struct pkt_nd_free_handle {
-	struct ndv_packet_hdr_control_1 hdr;
-	struct free_handle_ioctl ioctl;
-};
-
-/*
- * Cancel I/O.
- */
-
-struct cancel_io_ioctl {
-	struct nd_handle in;
-};
-
-struct pkt_nd_cancel_io {
-	struct ndv_packet_hdr_control_1 hdr;
-	struct cancel_io_ioctl ioctl;
-};
-
-/*
- * Connector states:
- */
-
-enum connector_state {
-	HVND_CON_INCOMING,
-	HVND_CON_INCOMING_ESTABLISHED,
-	HVND_CON_INCOMING_REJECTED,
-	HVND_CON_OUTGOING_REQUEST
-};
-
-
-/*
- * Adaptor query IOCTL.
- */
-struct nd_adap_query_ioctl {
-	union {
-		struct nd_adapter_query ad_q;
-		struct adapter_info_v2 ad_info;
-	};
-};
-
-struct pkt_nd_query_adaptor {
-	struct ndv_packet_hdr_control_1 hdr;
-	struct nd_adap_query_ioctl ioctl;
-};
-
-struct nd_ioctl {
-	union {
-		struct nd_handle handle;
-		u8 raw_buffer[NDV_MAX_IOCTL_BUFFER_SIZE];
-	};
-};
-
-struct pkt_nd_provider_ioctl {
-	struct ndv_packet_hdr_control_1 hdr;
-	struct nd_ioctl ioctl;
-};
-
-struct hvnd_ib_pd {
-	struct ib_pd ibpd;
-	u32 pdn;
-	u64 handle;
-};
-
-struct hvnd_work {
-	struct work_struct work;
-	void *callback_arg;
-};
-
-struct hvnd_disconnect_work {
-	struct work_struct work;
-	int status;
-	void *callback_arg;
-};
-
-/*
-struct hvnd_delayed_work {
-	struct delayed_work work;
-	void *callback_arg;
-};
-*/
-
-enum hvnd_cm_state {
-	hvnd_cm_idle = 0,
-	hvnd_cm_connect_reply_sent, //active
-	hvnd_cm_connect_reply_refused,
-	hvnd_cm_connect_received, //active
-	hvnd_cm_connect_request_sent, //passive
-	hvnd_cm_accept_sent,
-	hvnd_cm_close_sent,
-	hvnd_cm_established_sent,
-};
-
-struct incoming_pkt {
-	struct list_head list_entry;
-	char pkt[0];
-};
-
-struct hvnd_ep_obj {
-/*
-	spinlock_t ep_lk;
-	bool to_be_destroyed;
-	bool io_outstanding;
-
-	wait_queue_head_t wait;
-	bool stopped;
-	atomic_t process_refcnt; // how many NDV_PKT_ID1_COMPLETE packets we are currently processing
-*/
-	bool stopping;
-	wait_queue_head_t wait_pending;
-	atomic_t nr_requests_pending;
-
-	enum nd_resource_type type;
-	enum connector_state state; //KYS need to look at locking
-	struct iw_cm_id *cm_id;
-	enum hvnd_cm_state cm_state;
-	struct completion block_event;
-	struct completion disconnect_event;
-	struct completion connector_accept_event;
-	int connector_accept_status;
-	u64 ep_handle;
-	spinlock_t incoming_pkt_list_lock;
-	struct list_head incoming_pkt_list;
-	struct hvnd_ep_obj *parent;
-	struct hvnd_dev *nd_dev;
-	struct hvnd_ucontext *uctx;
-	struct hvnd_work wrk;
-	struct hvnd_cq *cq;
-	u8 ord;
-	u8 ird;
-	char priv_data[MAX_PRIVATE_DATA_LEN];
-	bool incoming;
-	atomic_t disconnect_notified;
-	u64 outstanding_handle;
-	u32 local_irp;
-	struct hvnd_ep_obj *outstanding_ep;
-	struct pkt_nd_connector_connect connector_connect_pkt;
-	int connector_connect_retry;
-};
-
-struct hvnd_ucontext {
-	struct ib_ucontext ibucontext;
-	struct list_head listentry;
-	struct ndv_pkt_hdr_create_1 create_pkt;
-	struct ndv_pkt_hdr_create_1 create_pkt_ovl; /* Overlap handle */
-	struct pkt_nd_provider_ioctl pr_init_pkt;
-	union ndv_context_handle file_handle;
-	union ndv_context_handle file_handle_ovl;
-
-	struct pkt_nd_open_adapter o_adap_pkt;
-
-	u64 adaptor_hdl;
-
-	/*
-	 * Protection domain state.
-	 */
-	struct pkt_nd_pd_create pd_cr_pkt;
-
-	u64 uar_base;
-	u64 bf_base;
-	u32 bf_buf_size;
-	u32 bf_offset;
-	u32 cqe_size;
-	u32 max_qp_wr;
-	u32 max_sge;
-	u32 max_cqe;
-	u32 num_qps;
-
-	/*
-	 * State to manage dorbell pages:
-	 */
-	struct list_head db_page_list;
-	struct mutex db_page_mutex;
-
-	atomic_t refcnt;
-
-};
-
-struct hvnd_dev {
-	struct ib_device ibdev;
-	struct hv_device *hvdev;
-	u32 device_cap_flags;
-	unsigned char nports;
-	bool ib_active;
-
-	/* State to manage interaction with the host.
-	 */
-
-	spinlock_t uctxt_lk;
-	struct list_head listentry;
-
-	unsigned long mmio_sz;
-	unsigned long mmio_start_addr;
-	struct resource mmio_resource;
-	void *mmio_virt;
-
-	unsigned long negotiated_version;
-	union ndv_packet_init init_pkt;
-	struct ndv_pkt_hdr_init_resources_1 resources;
-	struct ndv_pkt_hdr_bind_1 bind_pkt;
-
-	struct ndv_pkt_hdr_create_1 global_create_pkt;
-	union ndv_context_handle global_file_handle;
-
-	struct semaphore query_pkt_sem;
-	bool query_pkt_set;
-	struct pkt_nd_query_adaptor query_pkt;
-
-	/*
-	 * ID tables.
-	 */
-	spinlock_t id_lock;
-
-	struct idr cqidr;
-	struct idr qpidr;
-	struct idr mmidr;
-	struct idr irpidr;
-	struct idr uctxidr;
-	atomic_t open_cnt;
-
-	char ip_addr[4];
-	char mac_addr[6];
-	struct completion addr_set;
-	int bind_complete;
-	struct mutex bind_mutex;
-};
-
-struct hvnd_cq {
-	struct ib_cq ibcq;
-	void *cq_buf;
-	void *db_addr;
-	u32 arm_sn;
-	u32 entries;
-
-	u32 cqn;
-	u32 cqe;
-	u64 cq_handle;
-
-	struct ib_umem *umem;
-	struct ib_umem *db_umem;
-	struct mlx4_ib_user_db_page user_db_page;
-	struct hvnd_ucontext *uctx;
-	struct hvnd_ep_obj ep_object; //KYS need to clean this up; have a cq irp state
-	bool monitor;
-	bool upcall_pending;
-};
-
-struct hvnd_qp {
-	struct ib_qp ibqp;
-	void *qp_buf;
-	void *db_addr;
-	u32 buf_size;
-	u8 port;
-	struct hvnd_dev *nd_dev;
-
-	__u8 log_sq_bb_count;
-	__u8 log_sq_stride;
-	__u8 sq_no_prefetch;
-
-	int rq_wqe_cnt;
-	int rq_wqe_shift;
-	int rq_max_gs;
-
-	int sq_wqe_cnt;
-	int sq_wqe_shift;
-	int sq_max_gs;
-
-	u32 max_inline_data;
-
-	u32 initiator_q_depth;
-	u32 initiator_request_sge;
-
-	u32 receive_q_depth;
-	u32 receive_request_sge;
-
-	struct hvnd_cq *recv_cq;
-	struct hvnd_cq *send_cq;
-
-	u64 receive_cq_handle;
-	u64 initiator_cq_handle;
-	u64 pd_handle;
-
-	u64 qp_handle;
-	u32 qpn;
-	u32 max_send_wr;
-	u32 max_recv_wr;
-	u32 max_send_sge;
-	u32 max_recv_sge;
-
-	struct ib_umem *umem;
-	struct ib_umem *db_umem;
-	struct mlx4_ib_user_db_page user_db_page;
-	struct hvnd_ucontext *uctx;
-	struct iw_cm_id *cm_id;
-
-	/*
-	 * Current QP state; need to look at locking.
-	 * XXXKYS
-	 */
-	enum ib_qp_state qp_state;
-	bool cq_notify;
-	wait_queue_head_t wait;
-	atomic_t refcnt;
-	struct hvnd_ep_obj *connector;
-};
-
-struct hvnd_mr {
-	struct ib_mr ibmr;
-	struct hvnd_ib_pd *pd;
-	struct ib_umem *umem;
-	u64 start;
-	u64 length;
-	u64 virt;
-	int acc;
-	u64 mr_handle;
-	u32 mr_lkey;
-	u32 mr_rkey;
-};
-
-struct hvnd_cookie {
-	struct completion host_event;
-	void *pkt;
-};
-
-/*
- * Definitions to retrieve the IP address.
- */
-
-#define HVND_CURRENT_VERSION 0
-
-struct hvnd_ipaddr_tuple {
-	char mac_address[ETH_ALEN];
-	struct sockaddr addr;
-};
-
-struct hvnd_msg {
-	int status;
-	struct hvnd_ipaddr_tuple ip_tuple;
-};
-
-static inline struct hvnd_ib_pd *to_nd_pd(struct ib_pd *pd)
-{
-	return container_of(pd, struct hvnd_ib_pd, ibpd);
-}
-
-static inline struct hvnd_dev *to_nd_dev(struct ib_device *ibdev)
-{
-	return container_of(ibdev, struct hvnd_dev, ibdev);
-}
-
-static inline struct hvnd_cq *to_nd_cq(struct ib_cq *ibcq)
-{
-	return container_of(ibcq, struct hvnd_cq, ibcq);
-}
-
-static inline struct hvnd_qp *to_nd_qp(struct ib_qp *ibqp)
-{
-	return container_of(ibqp, struct hvnd_qp, ibqp);
-}
-
-static inline struct hvnd_ucontext *to_nd_context(struct ib_ucontext *ibucontext)
-{
-	return container_of(ibucontext, struct hvnd_ucontext, ibucontext);
-}
-
-static inline struct hvnd_ucontext *get_uctx_from_pd(struct ib_pd *pd)
-{
-	return to_nd_context(pd->uobject->context);
-}
-
-static inline struct hvnd_mr *to_nd_mr(struct ib_mr *ibmr)
-{
-	return container_of(ibmr, struct hvnd_mr, ibmr);
-}
-/*
- * ID management.
- */
-
-static inline int insert_handle(struct hvnd_dev *dev, struct idr *idr,
-				void *handle, u32 id)
-{
-	int ret;
-	unsigned long flags;
-
-	idr_preload(GFP_KERNEL);
-	spin_lock_irqsave(&dev->id_lock, flags);
-
-	ret = idr_alloc(idr, handle, id, id + 1, GFP_ATOMIC);
-
-	spin_unlock_irqrestore(&dev->id_lock, flags);
-	idr_preload_end();
-
-	BUG_ON(ret == -ENOSPC);
-	return ret < 0 ? ret : 0;
-}
-
-static inline void remove_handle(struct hvnd_dev *dev, struct idr *idr, u32 id)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&dev->id_lock, flags);
-	idr_remove(idr, id);
-	spin_unlock_irqrestore(&dev->id_lock, flags);
-}
-
-static inline struct hvnd_cq *get_cqp(struct hvnd_dev *dev, u32 cqid)
-{
-	struct hvnd_cq *cqp;
-	unsigned long flags;
-
-	spin_lock_irqsave(&dev->id_lock, flags);
-	cqp = idr_find(&dev->cqidr, cqid);
-	spin_unlock_irqrestore(&dev->id_lock, flags);
-
-	return cqp;
-}
-
-static inline struct hvnd_qp *get_qpp(struct hvnd_dev *dev, u32 qpid)
-{
-	struct hvnd_qp *qpp;
-	unsigned long flags;
-
-	spin_lock_irqsave(&dev->id_lock, flags);
-	qpp = idr_find(&dev->qpidr, qpid);
-	spin_unlock_irqrestore(&dev->id_lock, flags);
-
-	return qpp;
-}
-
-static inline struct hvnd_ucontext *get_uctx(struct hvnd_dev *dev, u32 pid)
-{
-	struct hvnd_ucontext *uctx;
-	unsigned long flags;
-
-	spin_lock_irqsave(&dev->id_lock, flags);
-	uctx = idr_find(&dev->uctxidr, pid);
-	spin_unlock_irqrestore(&dev->id_lock, flags);
-
-	return uctx;
-}
-
-
-static inline void *map_irp_to_ctx(struct hvnd_dev *nd_dev, u32 irp)
-{
-	void *ctx;
-	unsigned long flags;
-
-	spin_lock_irqsave(&nd_dev->id_lock, flags);
-	ctx = idr_find(&nd_dev->irpidr, irp);
-	spin_unlock_irqrestore(&nd_dev->id_lock, flags);
-
-	return ctx;
-}
-
-
-
-void hvnd_callback(void *context);
-int hvnd_negotiate_version(struct hvnd_dev *nd_dev);
-int hvnd_init_resources(struct hvnd_dev *nd_dev);
-int hvnd_bind_nic(struct hvnd_dev *nd_dev, bool un_bind, char *ip_addr, char *mac_addr);
-int hvnd_open_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx);
-int hvnd_close_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx);
-int hvnd_query_adaptor(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx);
-int hvnd_create_pd(struct hvnd_ucontext *uctx, struct hvnd_dev *nd_dev,
-		   struct hvnd_ib_pd *hvnd_pd);
-
-/*
- * CQ operations.
- */
-int hvnd_create_cq(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-		   struct hvnd_cq *cq);
-int hvnd_destroy_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq);
-int hvnd_notify_cq(struct hvnd_dev *nd_dev, struct hvnd_cq *cq,
-		   u32 notify_type, u64 irp_handle);
-
-/*
- * QP operations.
- */
-int hvnd_create_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-		   struct hvnd_qp *qp);
-
-int hvnd_free_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-		 struct hvnd_qp *qp);
-
-int hvnd_flush_qp(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-		  struct hvnd_qp *qp);
-
-/*
- * MR operations.
- */
-
-int hvnd_cr_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-	       u64 pd_handle, u64 *mr_handle);
-
-int hvnd_free_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-		 u64 handle);
-
-int hvnd_mr_register(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-		     struct hvnd_mr *mr);
-int hvnd_deregister_mr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-		       u64 handle);
-
-/*
- * Listner operations
- */
-int hvnd_cr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx, u64 *handle);
-
-
-int hvnd_free_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-		       u64 listener_handle);
-
-int hvnd_bind_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-		       u64 listener_handle, union nd_sockaddr_inet *addr);
-
-int hvnd_listen_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-			 u64 listener_handle, u32 backlog);
-
-int hvnd_get_addr_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-			   u64 listener_handle, union nd_sockaddr_inet *addr);
-
-int hvnd_get_connection_listener(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-				 u64 listener_handle, u64 connector_handle,
-				 u64 irp_handle);
-
-/*
- * Connector operations.
- */
-int hvnd_cr_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-		      u64 *connector_handle);
-
-int hvnd_free_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-			u64 handle);
-
-int hvnd_cancelio_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-			    u64 handle);
-int hvnd_bind_connector(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-			u64 handle, union nd_sockaddr_inet *addr);
-
-int hvnd_connector_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-			   u64 connector_handle, u32 in_rd_limit, u32 out_rd_limit,
-			   u32 priv_data_length, const u8 *priv_data,
-			   u64 qp_handle, struct if_physical_addr *phys_addr,
-			   union nd_sockaddr_inet *dest_addr, struct hvnd_ep_obj *ep);
-
-int hvnd_connector_complete_connect(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-				    u64 connector_handle, enum ibv_qp_state *qp_state);
-
-int hvnd_connector_accept(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-			  u64 connector_handle,
-			  u64 qp_handle,
-			  u32 in_rd_limit, u32 out_rd_limit,
-			  u32 priv_data_length, const u8 *priv_data,
-			  enum ibv_qp_state *qp_state, struct hvnd_ep_obj *ep);
-
-int hvnd_connector_reject(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-			  u64 connector_handle,
-			  u32 priv_data_length, u8 *priv_data,
-			  enum ibv_qp_state *qp_state);
-
-int hvnd_connector_get_rd_limits(struct hvnd_dev *nd_dev,
-				 struct hvnd_ucontext *uctx,
-				 u64 connector_handle,
-				 struct nd_read_limits *rd_limits);
-
-int hvnd_connector_get_priv_data(struct hvnd_dev *nd_dev,
-				 struct hvnd_ucontext *uctx,
-				 u64 connector_handle,
-				 u8 *priv_data);
-
-int hvnd_connector_get_peer_addr(struct hvnd_dev *nd_dev,
-				 struct hvnd_ucontext *uctx,
-				 u64 connector_handle,
-				 union nd_sockaddr_inet *peer_addr);
-
-int hvnd_connector_get_local_addr(struct hvnd_dev *nd_dev,
-				  struct hvnd_ucontext *uctx,
-				  u64 connector_handle,
-				  union nd_sockaddr_inet *local_addr);
-
-int hvnd_connector_notify_disconnect(struct hvnd_dev *nd_dev,
-				     struct hvnd_ucontext *uctx,
-				     u64 connector_handle, struct hvnd_ep_obj *ep);
-
-
-int hvnd_connector_disconnect(struct hvnd_dev *nd_dev,
-			      struct hvnd_ucontext *uctx,
-			      u64 connector_handle, struct hvnd_ep_obj *ep);
-
-int hvnd_free_handle(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-		     u64 handle, u32 ioctl);
-
-int hvnd_cancel_io(struct hvnd_ep_obj *ep_object);
-
-char *hvnd_get_op_name(int ioctl);
-
-void hvnd_acquire_uctx_ref(struct hvnd_ucontext *uctx);
-void hvnd_drop_uctx_ref(struct hvnd_dev *nd_dev,struct hvnd_ucontext *uctx);
-void hvnd_process_events(struct work_struct *work);
-
-void hvnd_process_cq_event_pending(struct hvnd_ep_obj *ep, int status);
-void hvnd_process_cq_event_complete(struct hvnd_ep_obj *ep, int status);
-void hvnd_process_connector_accept(struct hvnd_ep_obj *ep_object, int status);
-void hvnd_process_notify_disconnect(struct hvnd_ep_obj *ep_object, int status);
-void hvnd_process_disconnect(struct hvnd_ep_obj *ep_object, int status);
-
-void put_irp_handle(struct hvnd_dev *nd_dev, u32 irp);
-int get_irp_handle(struct hvnd_dev *nd_dev, u32 *local, void *irp_ctx);
-
-void hvnd_init_hdr(struct ndv_packet_hdr_control_1 *hdr,
-		   u32 data_sz, u32 local, u32 remote,
-		   u32 ioctl_code,
-		   u32 ext_data_sz, u32 ext_data_offset,
-		   u64 irp_handle);
-
-int hvnd_send_ioctl_pkt(struct hvnd_dev *nd_dev,
-			struct ndv_packet_hdr_control_1 *hdr,
-			u32 pkt_size, u64 cookie);
-
-int hvnd_get_outgoing_rdma_addr(struct hvnd_dev *nd_dev, struct hvnd_ucontext *uctx,
-				union nd_sockaddr_inet *og_addr);
-
-int hvnd_get_neigh_mac_addr(struct sockaddr *local, struct sockaddr *remote, char *mac_addr);
-
-void hvnd_addr_init(void);
-
-void hvnd_addr_deinit(void);
-
-bool ep_add_work_pending(struct hvnd_ep_obj *ep_object);
-void ep_del_work_pending(struct hvnd_ep_obj *ep_object);
-void ep_stop(struct hvnd_ep_obj *ep_object);
-
-#define current_pid() (current->pid)
-/*
- * NT STATUS defines.
- */
-
-#define STATUS_SUCCESS 0x0
-#define STATUS_PENDING 0x00000103
-#define STATUS_CANCELLED 0xC0000120
-#define STATUS_DISCONNECTED 0xC000020C
-#define STATUS_TIMEOUT 0xC00000B5
-
-void inc_ioctl_counter_request(unsigned ioctl);
-void inc_ioctl_counter_response(unsigned ioctl);
-
-#define NDV_PROTOCOL_VAERSION_INVALID -1
-#define NDV_PACKET_INIT_SIZE 16 /* Size of the INIT packet */
-
-#define HVND_RING_SZ (PAGE_SIZE * 64)
-
-/* logging levels */
-#define HVND_ERROR 0
-#define HVND_WARN 1
-#define HVND_INFO 2
-#define HVND_DEBUG 3
-
-extern int hvnd_log_level;
-
-#define hvnd_error(fmt, args...) hvnd_log(HVND_ERROR, fmt, ##args)
-#define hvnd_warn(fmt, args...) hvnd_log(HVND_WARN, fmt, ##args)
-#define hvnd_info(fmt, args...) hvnd_log(HVND_INFO, fmt, ##args)
-#define hvnd_debug(fmt, args...) hvnd_log(HVND_DEBUG, fmt, ##args)
-
-#define hvnd_log(level, fmt, args...) \
-do { \
-	if (unlikely(hvnd_log_level >= (level))) \
-		printk(KERN_ERR "hvnd %s[%u]: " fmt, __func__, __LINE__, ##args); \
-} while (0)
-
-#endif /* _VMBUS_RDMA_H */