linux
5 years agoMerge branch 'mlx5-next' into rdma.git
Jason Gunthorpe [Thu, 20 Dec 2018 20:24:50 +0000 (13:24 -0700)]
Merge branch 'mlx5-next' into rdma.git

From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

mlx5 updates taken for dependencies on following patches.

* branche 'mlx5-next': (23 commits)
  IB/mlx5: Introduce uid as part of alloc/dealloc transport domain
  net/mlx5: Add shared Q counter bits
  net/mlx5: Continue driver initialization despite debugfs failure
  net/mlx5: Fold the modify lag code into function
  net/mlx5: Add lag affinity info to log
  net/mlx5: Split the activate lag function into two routines
  net/mlx5: E-Switch, Introduce flow counter affinity
  IB/mlx5: Unify e-switch representors load approach between uplink and VFs
  net/mlx5: Use lowercase 'X' for hex values
  net/mlx5: Remove duplicated include from eswitch.c
  net/mlx5: Remove the get protocol device interface entry
  net/mlx5: Support extended destination format in flow steering command
  net/mlx5: E-Switch, Change vhca id valid bool field to bit flag
  net/mlx5: Introduce extended destination fields
  net/mlx5: Revise gre and nvgre key formats
  net/mlx5: Add monitor commands layout and event data
  net/mlx5: Add support for plugged-disabled cable status in PME
  net/mlx5: Add support for PCIe power slot exceeded error in PME
  net/mlx5: Rework handling of port module events
  net/mlx5: Move flow counters data structures from flow steering header
  ...

5 years agoIB/mlx5: Introduce uid as part of alloc/dealloc transport domain
Yishai Hadas [Wed, 19 Dec 2018 14:28:10 +0000 (16:28 +0200)]
IB/mlx5: Introduce uid as part of alloc/dealloc transport domain

Introduce uid as part of alloc/dealloc transport domain to match the
device specification.

Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA/bnxt_re: Increase depth of control path command queue
Devesh Sharma [Wed, 12 Dec 2018 09:56:24 +0000 (01:56 -0800)]
RDMA/bnxt_re: Increase depth of control path command queue

Increasing the depth of control path command queue to 8K entries to handle
burst of commands. This feature needs support from FW and the driver/fw
compatibility is checked from the interface version number.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/bnxt_re: Query HWRM Interface version from FW
Selvin Xavier [Wed, 12 Dec 2018 09:56:23 +0000 (01:56 -0800)]
RDMA/bnxt_re: Query HWRM Interface version from FW

Get HWRM interface major, minor, build and patch version from FW for
checking the FW/Driver compatibility.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/usnic: Fix potential deadlock
Parvi Kaustubhi [Tue, 11 Dec 2018 22:15:42 +0000 (14:15 -0800)]
IB/usnic: Fix potential deadlock

Acquiring the rtnl lock while holding usdev_lock could result in a
deadlock.

For example:

usnic_ib_query_port()
| mutex_lock(&us_ibdev->usdev_lock)
 | ib_get_eth_speed()
  | rtnl_lock()

rtnl_lock()
| usnic_ib_netdevice_event()
 | mutex_lock(&us_ibdev->usdev_lock)

This commit moves the usdev_lock acquisition after the rtnl lock has been
released.

This is safe to do because usdev_lock is not protecting anything being
accessed in ib_get_eth_speed(). Hence, the correct order of holding locks
(rtnl -> usdev_lock) is not violated.

Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/bnxt_re: Make use of destroy AH sleepable flag
Gal Pressman [Wed, 12 Dec 2018 09:09:08 +0000 (11:09 +0200)]
RDMA/bnxt_re: Make use of destroy AH sleepable flag

When in a sleepable (non-atomic) context, wait for firmware completion
instead of polling for it.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/bnxt_re: Make use of create AH sleepable flag
Gal Pressman [Wed, 12 Dec 2018 09:09:07 +0000 (11:09 +0200)]
RDMA/bnxt_re: Make use of create AH sleepable flag

When in a sleepable (non-atomic) context, wait for firmware completion
instead of polling for it.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA: Mark if destroy address handle is in a sleepable context
Gal Pressman [Wed, 12 Dec 2018 09:09:06 +0000 (11:09 +0200)]
RDMA: Mark if destroy address handle is in a sleepable context

Introduce a 'flags' field to destroy address handle callback and add a
flag that marks whether the callback is executed in an atomic context or
not.

This will allow drivers to wait for completion instead of polling for it
when it is allowed.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA: Mark if create address handle is in a sleepable context
Gal Pressman [Wed, 12 Dec 2018 09:09:05 +0000 (11:09 +0200)]
RDMA: Mark if create address handle is in a sleepable context

Introduce a 'flags' field to create address handle callback and add a flag
that marks whether the callback is executed in an atomic context or not.

This will allow drivers to wait for completion instead of polling for it
when it is allowed.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/srpt: Add support for immediate data
Bart Van Assche [Mon, 17 Dec 2018 21:20:46 +0000 (13:20 -0800)]
RDMA/srpt: Add support for immediate data

Modify allocation of the non-SRQ receive queues such that immediate
data is aligned on a 512 byte boundary. That alignment is necessary
to pass the immediate data without copying to the block layer. When
receiving an SRP_CMD with immediate data, postpone the ib_post_recv()
call until target_execute_cmd() has finished. See also
srpt_release_cmd().

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/srpt: Rework the srpt_alloc_srq() error path
Bart Van Assche [Mon, 17 Dec 2018 21:20:45 +0000 (13:20 -0800)]
RDMA/srpt: Rework the srpt_alloc_srq() error path

This patch does not change any functionality but makes the next patch
easier to read.

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/srpt: Remove driver version and release date
Bart Van Assche [Mon, 17 Dec 2018 21:20:44 +0000 (13:20 -0800)]
RDMA/srpt: Remove driver version and release date

Neither a driver version number nor a release data is useful in
an upstream driver. Remove the word "InfiniBand" from the driver
description because recently RoCE support has been added to this
driver.

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/srpt: Make kernel-doc headers complete
Bart Van Assche [Mon, 17 Dec 2018 21:20:43 +0000 (13:20 -0800)]
RDMA/srpt: Make kernel-doc headers complete

Add documentation for those structure members for which it is missing.

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/srpt: Join split strings
Bart Van Assche [Mon, 17 Dec 2018 21:20:42 +0000 (13:20 -0800)]
RDMA/srpt: Join split strings

Make sure that long strings occur on a single line as required by the
coding standard.

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/srpt: Improve coding style conformance
Bart Van Assche [Mon, 17 Dec 2018 21:20:41 +0000 (13:20 -0800)]
RDMA/srpt: Improve coding style conformance

Use tabs instead of spaces for indentation. Make sure that multi-line
expressions have the operator at the end of a line instead of the start.
Avoid a complaint about a missing space in a ternary expression by
changing '(boolean) ? 1: 0' into 'boolean'.

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/srpt: Fix a use-after-free in the channel release code
Bart Van Assche [Mon, 17 Dec 2018 21:20:40 +0000 (13:20 -0800)]
RDMA/srpt: Fix a use-after-free in the channel release code

This patch avoids that KASAN sporadically reports the following:

BUG: KASAN: use-after-free in rxe_run_task+0x1e/0x60 [rdma_rxe]
Read of size 1 at addr ffff88801c50d8f4 by task check/24830

CPU: 4 PID: 24830 Comm: check Not tainted 4.20.0-rc6-dbg+ #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
Call Trace:
 dump_stack+0x86/0xca
 print_address_description+0x71/0x239
 kasan_report.cold.5+0x242/0x301
 __asan_load1+0x47/0x50
 rxe_run_task+0x1e/0x60 [rdma_rxe]
 rxe_post_send+0x4bd/0x8d0 [rdma_rxe]
 srpt_zerolength_write+0xe1/0x160 [ib_srpt]
 srpt_close_ch+0x8b/0xe0 [ib_srpt]
 srpt_set_enabled+0xe7/0x150 [ib_srpt]
 srpt_tpg_enable_store+0xc0/0x100 [ib_srpt]
 configfs_write_file+0x157/0x1d0
 __vfs_write+0xd7/0x3d0
 vfs_write+0x102/0x290
 ksys_write+0xab/0x130
 __x64_sys_write+0x43/0x50
 do_syscall_64+0x71/0x210
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Allocated by task 13856:
 save_stack+0x43/0xd0
 kasan_kmalloc+0xc7/0xe0
 kasan_slab_alloc+0x11/0x20
 kmem_cache_alloc+0x105/0x320
 rxe_alloc+0xff/0x1f0 [rdma_rxe]
 rxe_create_qp+0x9f/0x160 [rdma_rxe]
 ib_create_qp+0xf5/0x690 [ib_core]
 rdma_create_qp+0x6a/0x140 [rdma_cm]
 srpt_cm_req_recv.cold.59+0x1588/0x237b [ib_srpt]
 srpt_rdma_cm_req_recv.isra.35+0x1d5/0x220 [ib_srpt]
 srpt_rdma_cm_handler+0x6f/0x100 [ib_srpt]
 cma_listen_handler+0x59/0x60 [rdma_cm]
 cma_ib_req_handler+0xd5b/0x2570 [rdma_cm]
 cm_process_work+0x2e/0x110 [ib_cm]
 cm_work_handler+0x2aae/0x502b [ib_cm]
 process_one_work+0x481/0x9e0
 worker_thread+0x67/0x5b0
 kthread+0x1cf/0x1f0
 ret_from_fork+0x24/0x30

Freed by task 3440:
 save_stack+0x43/0xd0
 __kasan_slab_free+0x139/0x190
 kasan_slab_free+0xe/0x10
 kmem_cache_free+0xbc/0x330
 rxe_elem_release+0x66/0xe0 [rdma_rxe]
 rxe_destroy_qp+0x3f/0x50 [rdma_rxe]
 ib_destroy_qp+0x140/0x360 [ib_core]
 srpt_release_channel_work+0xdc/0x310 [ib_srpt]
 process_one_work+0x481/0x9e0
 worker_thread+0x67/0x5b0
 kthread+0x1cf/0x1f0
 ret_from_fork+0x24/0x30

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/srp: Add support for immediate data
Bart Van Assche [Mon, 17 Dec 2018 21:20:39 +0000 (13:20 -0800)]
RDMA/srp: Add support for immediate data

Request permission to send immediate data during login. If the SRP
target grants this request, send the payload of write requests <= 8 KB
as immediate data.

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/srp: Rework handling of the maximum information unit length
Bart Van Assche [Mon, 17 Dec 2018 21:20:38 +0000 (13:20 -0800)]
RDMA/srp: Rework handling of the maximum information unit length

Move the maximum initiator to target information unit length parameter
from struct srp_target_port into struct srp_rdma_ch. This patch does
not change any functionality but makes the next patch easier to read.

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/srp: Move srp_rdma_ch.max_ti_iu_len declaration
Bart Van Assche [Mon, 17 Dec 2018 21:20:37 +0000 (13:20 -0800)]
RDMA/srp: Move srp_rdma_ch.max_ti_iu_len declaration

Since srp_rdma_ch.max_ti_iu_len is used in the hot path, move it to the
section with data structure members used in the hot path.

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/srp: Propagate ib_post_send() failures to the SCSI mid-layer
Bart Van Assche [Mon, 17 Dec 2018 21:20:36 +0000 (13:20 -0800)]
RDMA/srp: Propagate ib_post_send() failures to the SCSI mid-layer

This patch avoids that the SCSI mid-layer keeps retrying forever if
ib_post_send() fails. This was discovered while testing immediate
data support and passing a too large num_sge value to ib_post_send().

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/srp: Handle large SCSI CDBs correctly
Bart Van Assche [Mon, 17 Dec 2018 21:20:35 +0000 (13:20 -0800)]
RDMA/srp: Handle large SCSI CDBs correctly

Reserve additional space for CDBs that contain more than sixteen bytes
and set the add_cdb_len field for such CDBs as required. From the SRP
standard: "The ADDITIONAL CDB LENGTH field contains the length in
dwords of the ADDITIONAL CDB field."

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoRDMA/srp: Document srp_parse_in() arguments
Bart Van Assche [Mon, 17 Dec 2018 21:20:34 +0000 (13:20 -0800)]
RDMA/srp: Document srp_parse_in() arguments

This patch avoids that a warning is reported when building with W=1.

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoinclude/scsi/srp.h: Add support for immediate data
Bart Van Assche [Mon, 17 Dec 2018 21:20:33 +0000 (13:20 -0800)]
include/scsi/srp.h: Add support for immediate data

Add constants and data structures to support immediate data. These
changes conform to SRP2r04.

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoinclude/scsi/srp.h: Move response flag definitions into this file
Bart Van Assche [Mon, 17 Dec 2018 21:20:32 +0000 (13:20 -0800)]
include/scsi/srp.h: Move response flag definitions into this file

This patch moves all constants that come from the SRP standard into the
include/scsi/srp.h header file.

Cc: Sergey Gorenko <sergeygo@mellanox.com>
Cc: Max Gurtovoy <maxg@mellanox.com>
Cc: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agoIB/mlx5: Fix compile issue when ODP disabled
Doug Ledford [Wed, 19 Dec 2018 18:43:17 +0000 (13:43 -0500)]
IB/mlx5: Fix compile issue when ODP disabled

When CONFIG_INFINIBAND_ON_DEMAND_PAGING is not enabled, we were getting
build failures for defined but not used code.  Fix that.

Fixes: 813e90b1aeaa ("IB/mlx5: Add advise_mr() support")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agonet/mlx5: Add shared Q counter bits
Leon Romanovsky [Thu, 6 Dec 2018 12:40:11 +0000 (14:40 +0200)]
net/mlx5: Add shared Q counter bits

Updated HW specification file with needed bits to allow
sharing of Q counters between DEVX contexts and kernel.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoRDMA: Cleanup undesired pd->uobject usage
Shamir Rabinovitch [Mon, 17 Dec 2018 15:15:18 +0000 (17:15 +0200)]
RDMA: Cleanup undesired pd->uobject usage

Drivers should be using udata to determine if a method is invoked from
user space or kernel space. A pd does not necessarily say a different
objects is kernel or user.

Transforming the tests to use udata eliminates a large number of uobject
references from the drivers.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/restrack: Resource-tracker should not use uobject pointers
Shamir Rabinovitch [Mon, 17 Dec 2018 15:15:16 +0000 (17:15 +0200)]
RDMA/restrack: Resource-tracker should not use uobject pointers

Having uobject pointer embedded in ib core objects is not aligned with a
future shared ib_x model. The resource tracker only does this to keep
track of user/kernel objects - track this directly instead.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/mlx5: Add advise_mr() support
Moni Shoua [Tue, 11 Dec 2018 11:37:53 +0000 (13:37 +0200)]
IB/mlx5: Add advise_mr() support

The verb advise_mr() is used to give advice to the kernel about an address
range that belongs to a MR.  Implement the verb and register it on the
device. The current implementation supports the only known advice to date,
prefetch.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/uverbs: Add support to advise_mr
Moni Shoua [Tue, 11 Dec 2018 11:37:52 +0000 (13:37 +0200)]
IB/uverbs: Add support to advise_mr

Add new ioctl method for the MR object - ADVISE_MR.

This command can be used by users to give an advice or directions to the
kernel about an address range that belongs to memory regions.

A new ib_device callback, advise_mr(), is introduced here to suupport the
new command. This command takes the following arguments:

- pd: The protection domain to which all memory regions belong
- advice:  The type of the advice
   * IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH - Pre-fetch a range of
an on-demand paging MR
   * IB_UVERBS_ADVISE_MR_ADVICE_PREFETCH_WRITE - Pre-fetch a range
of an on-demand paging MR with write intention
- flags: The properties of the advice
* IB_UVERBS_ADVISE_MR_FLAG_FLUSH - Operation must end before
return to the caller
- sg_list: The list of memory ranges
- num_sge: The number of memory ranges in the list
- attrs: More attributes to be parsed by the provider

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/uverbs: Add helper to get array size from ptr attribute
Moni Shoua [Tue, 11 Dec 2018 11:37:51 +0000 (13:37 +0200)]
IB/uverbs: Add helper to get array size from ptr attribute

When the parser of an ioctl command has the knowledge that a ptr attribute
in a bundle represents an array of structures, it is useful for it to know
the number of elements in the array. This is done by dividing the
attribute length with the element size.

Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/uverbs: Add an ioctl method to destroy an object
Parav Pandit [Fri, 30 Nov 2018 11:16:48 +0000 (13:16 +0200)]
RDMA/uverbs: Add an ioctl method to destroy an object

Add an ioctl method to destroy the PD, MR, MW, AH, flow, RWQ indirection
table and XRCD objects by handle which doesn't require any output response
during destruction.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/uverbs: Add a method to introspect handles in a context
Jason Gunthorpe [Fri, 30 Nov 2018 11:16:47 +0000 (13:16 +0200)]
RDMA/uverbs: Add a method to introspect handles in a context

Introduce a helper function gather_objects_handle() to copy object handles
under a spin lock.

Expose these objects handles via the uverbs ioctl interface.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
5 years agoIB/mlx4: Utilize macro to calculate SQ spare size
Yuval Shaia [Tue, 11 Dec 2018 08:36:47 +0000 (10:36 +0200)]
IB/mlx4: Utilize macro to calculate SQ spare size

The macro MLX4_IB_SQ_HEADROOM calculates the spare room needed to be
left. Use it instead of hard-coding the HW prefetch size.

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: Fix an error code in hns_roce_create_srq()
Dan Carpenter [Mon, 17 Dec 2018 07:08:15 +0000 (10:08 +0300)]
RDMA/hns: Fix an error code in hns_roce_create_srq()

The function accidentally returns success on this error path.

Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/qib: Fix an error code in qib_sdma_verbs_send()
Dan Carpenter [Mon, 17 Dec 2018 07:05:36 +0000 (10:05 +0300)]
IB/qib: Fix an error code in qib_sdma_verbs_send()

We accidentally return success on this error path.

Fixes: f931551bafe1 ("IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/core: Delete RoCE GID in hw when corresponding IP is deleted
Parav Pandit [Tue, 18 Dec 2018 12:16:00 +0000 (14:16 +0200)]
RDMA/core: Delete RoCE GID in hw when corresponding IP is deleted

Currently a RoCE GID entry is removed from the hardware when all
references to the GID entry drop to zero. This is a change in behavior
from before the fixed patch. The GID entry should be removed from the
hardware when GID entry deletion is requested. This allows the driver
terminate ongoing traffic through the RoCE GID.

While a GID is deleted from the hardware, GID slot in the software GID
cache is not freed. GID slot is freed once all references of such GID are
dropped. This continue to ensure that such GID slot of hardware is not
allocated to new GID entry allocation request. It is allocated once all
references to GID entry drop.

This approach allows drivers to put a tombestone of some kind on the HW
GID index to block the traffic.

Fixes: b150c3862d21 ("IB/core: Introduce GID entry reference counts")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/mlx5: Fix function name typo 'fileds' -> 'fields'
Gal Pressman [Tue, 18 Dec 2018 15:57:32 +0000 (17:57 +0200)]
RDMA/mlx5: Fix function name typo 'fileds' -> 'fields'

Fix typo in 'set_mr_fileds' -> 'set_mr_fields'.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/i40iw: Make sure to initialize ib_device_ops
Kamal Heib [Tue, 18 Dec 2018 20:55:07 +0000 (22:55 +0200)]
RDMA/i40iw: Make sure to initialize ib_device_ops

The initialization of the ib_device_ops was dropped by mistake when
rebasing the ib_device_ops series, this patch fixes that.

Fixes: 15644f57cb66 ("RDMA/i40iw: Initialize ib_device_ops struct")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/mlx5: Delete unreachable handle_atomic code by simplifying SW completion
Leon Romanovsky [Wed, 12 Dec 2018 17:45:53 +0000 (19:45 +0200)]
RDMA/mlx5: Delete unreachable handle_atomic code by simplifying SW completion

Handle atomic was left as unimplemented from 2013, remove the code till
this part will be developed.

Remove the dead code by simplifying SW completion logic which is supposed
to be the same for send and receive paths.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> # compile tested
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/uverbs: Implement an ioctl that can call write and write_ex handlers
Jason Gunthorpe [Fri, 30 Nov 2018 11:06:21 +0000 (13:06 +0200)]
RDMA/uverbs: Implement an ioctl that can call write and write_ex handlers

Now that the handlers do not process their own udata we can make a
sensible ioctl that wrappers them. The ioctl follows the same format as
the write_ex() and has the user explicitly specify the core and driver
in/out opaque structures and a command number.

This works for all forms of write commands.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
5 years agonet/mlx5: Continue driver initialization despite debugfs failure
Leon Romanovsky [Thu, 13 Dec 2018 11:15:11 +0000 (13:15 +0200)]
net/mlx5: Continue driver initialization despite debugfs failure

The failure to create debugfs entry is unpleasant event, but not enough
to abort drier initialization. Align the mlx5_core code to debugfs design
and continue execution whenever debugfs_create_dir() successes or not.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Fold the modify lag code into function
Shahar Klein [Thu, 13 Dec 2018 03:11:41 +0000 (19:11 -0800)]
net/mlx5: Fold the modify lag code into function

Handle the code of modifying the lag affinity within a separate function.

Signed-off-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Add lag affinity info to log
Roi Dayan [Thu, 13 Dec 2018 03:11:40 +0000 (19:11 -0800)]
net/mlx5: Add lag affinity info to log

Report the initial LAG port affinity upon LAG creation.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Split the activate lag function into two routines
Roi Dayan [Thu, 13 Dec 2018 03:11:39 +0000 (19:11 -0800)]
net/mlx5: Split the activate lag function into two routines

Split the activate lag function in order to be symmetric with
the deactivate lag function.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Shahar Klein <shahark@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: E-Switch, Introduce flow counter affinity
Shahar Klein [Thu, 13 Dec 2018 03:11:38 +0000 (19:11 -0800)]
net/mlx5: E-Switch, Introduce flow counter affinity

This dictates the device affinity for eswitch flow counters, set by the FW
according to the HW device capabilities.

Under "source eswitch" affinity, the counter should be allocated on the
device related to the source vport in the match. This covers both non
merged e-switch mode as well as old FW that does not advertise this cap.

Under "flow eswitch" affinity, the counter should be allocated on the
device where the eswitch rule is set.

Signed-off-by: Shahar Klein <shahark@mellanox.com>
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agoIB/mlx5: Unify e-switch representors load approach between uplink and VFs
Mark Bloch [Thu, 13 Dec 2018 03:11:37 +0000 (19:11 -0800)]
IB/mlx5: Unify e-switch representors load approach between uplink and VFs

When in switchdev mode and the add function is called by the core
level driver, make sure we only register the callbacks, but don't
create the mlx5 IB device or initialize anything. With this change
all the IB devices in switchdev mode are created only once the load
callback is invoked by the e-switch core sub-module. This follows
the design paradigm under which the all the Eth representors must
be loaded before any of IB reprs is loaded.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Use lowercase 'X' for hex values
Saeed Mahameed [Thu, 13 Dec 2018 03:11:36 +0000 (19:11 -0800)]
net/mlx5: Use lowercase 'X' for hex values

Apparently gcc is cool with upper case '0X' but it is not commonly used.
Replace '0X' with lowercase '0x' in mlx5_ifc.h file.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Remove duplicated include from eswitch.c
YueHaibing [Wed, 12 Dec 2018 08:03:38 +0000 (08:03 +0000)]
net/mlx5: Remove duplicated include from eswitch.c

Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agoMAINTAINERS: Update usnic driver maintainers
Parvi Kaustubhi [Tue, 11 Dec 2018 22:15:43 +0000 (14:15 -0800)]
MAINTAINERS: Update usnic driver maintainers

Add Nelson Escobar and myself as maintainers for
drivers/infiniband/hw/usnic

Signed-off-by: Parvi Kaustubhi <pkaustub@cisco.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA: Start use ib_device_ops
Kamal Heib [Mon, 10 Dec 2018 19:09:48 +0000 (21:09 +0200)]
RDMA: Start use ib_device_ops

Make all the required change to start use the ib_device_ops structure.

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/rdmavt: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:47 +0000 (21:09 +0200)]
RDMA/rdmavt: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops() and remove the use of check_driver_override().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/rxe: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:46 +0000 (21:09 +0200)]
RDMA/rxe: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/vmw_pvrdma: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:45 +0000 (21:09 +0200)]
RDMA/vmw_pvrdma: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/usnic: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:44 +0000 (21:09 +0200)]
RDMA/usnic: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/qib: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:43 +0000 (21:09 +0200)]
RDMA/qib: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/qedr: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:42 +0000 (21:09 +0200)]
RDMA/qedr: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/ocrdma: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:41 +0000 (21:09 +0200)]
RDMA/ocrdma: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/nes: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:40 +0000 (21:09 +0200)]
RDMA/nes: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/mthca: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:39 +0000 (21:09 +0200)]
RDMA/mthca: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/mlx5: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:38 +0000 (21:09 +0200)]
RDMA/mlx5: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/mlx4: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:37 +0000 (21:09 +0200)]
RDMA/mlx4: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/i40iw: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:36 +0000 (21:09 +0200)]
RDMA/i40iw: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:35 +0000 (21:09 +0200)]
RDMA/hns: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hfi1: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:34 +0000 (21:09 +0200)]
RDMA/hfi1: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/cxgb4: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:33 +0000 (21:09 +0200)]
RDMA/cxgb4: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/cxgb3: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:32 +0000 (21:09 +0200)]
RDMA/cxgb3: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/bnxt_re: Initialize ib_device_ops struct
Kamal Heib [Mon, 10 Dec 2018 19:09:31 +0000 (21:09 +0200)]
RDMA/bnxt_re: Initialize ib_device_ops struct

Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/core: Introduce ib_device_ops
Kamal Heib [Mon, 10 Dec 2018 19:09:30 +0000 (21:09 +0200)]
RDMA/core: Introduce ib_device_ops

This change introduces the ib_device_ops structure that defines all the
InfiniBand device operations in one place, so the code will be more
readable and clean, unlike today when the ops are mixed with ib_device
data members.

The providers will need to define the supported operations and assign them
using ib_set_device_ops(), that will also make the providers code more
readable and clean.

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/mlx4: Remove unneeded NULL check
Yuval Shaia [Tue, 11 Dec 2018 10:26:35 +0000 (12:26 +0200)]
IB/mlx4: Remove unneeded NULL check

NULL check for kfree is unnecessary, remove it.

Fixes: b42dde478bca ("IB/mlx4: Rework special QP creation error path")
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/ocrdma: Use PCI-ID as an identification in debugfs
Leon Romanovsky [Tue, 11 Dec 2018 10:04:42 +0000 (12:04 +0200)]
RDMA/ocrdma: Use PCI-ID as an identification in debugfs

In opposite to current implementation of identification based on device
name, PCI-ID identification provides stable names suitable for device
rename. Change ocrdma debugfs representation to use PCI-ID.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/uverbs: Optimize clearing of extra bytes in response
Leon Romanovsky [Tue, 11 Dec 2018 09:41:05 +0000 (11:41 +0200)]
RDMA/uverbs: Optimize clearing of extra bytes in response

Clear extra bytes in response in batch manner instead
of doing it per-byte.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/vmw_pvrdma: Use atomic memory allocation in create AH
Gal Pressman [Mon, 10 Dec 2018 15:17:25 +0000 (17:17 +0200)]
RDMA/vmw_pvrdma: Use atomic memory allocation in create AH

Create address handle callback should not sleep, use GFP_ATOMIC instead of
GFP_KERNEL for memory allocation.

Fixes: 29c8d9eba550 ("IB: Add vmw_pvrdma driver")
Cc: Adit Ranadive <aditr@vmware.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/{mlx5,ocrdma,qedr,rxe}: Omit port validation from IB verbs
Yuval Shaia [Sun, 9 Dec 2018 11:06:10 +0000 (13:06 +0200)]
IB/{mlx5,ocrdma,qedr,rxe}: Omit port validation from IB verbs

RDMA core layer already make sure port is valid, no need to check it here
again.

For the pkey validation this depends on commit b3ac5742fead ("RDMA/core:
Validate port number in query_pkey verb")

Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/mlx5: Fail early if user tries to create flows on IB representors
Leon Romanovsky [Mon, 10 Dec 2018 09:19:49 +0000 (11:19 +0200)]
RDMA/mlx5: Fail early if user tries to create flows on IB representors

IB representors don't support creation of RAW ethernet QP flows.  Disable
them by reusing existing RDMA/core support macros.  We do it for both
creation and matcher because latter is not usable if no flow creation is
available.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/mlx5: Remove duplicated include from mlx5_ib.h
YueHaibing [Mon, 10 Dec 2018 07:27:59 +0000 (15:27 +0800)]
IB/mlx5: Remove duplicated include from mlx5_ib.h

Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/mlx5: Add 2X width support to query_port
Michael Guralnik [Sun, 9 Dec 2018 09:49:54 +0000 (11:49 +0200)]
IB/mlx5: Add 2X width support to query_port

Report to the user 2x width over MAD interface.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoMerge tag 'v4.20-rc6' into rdma.git for-next
Jason Gunthorpe [Tue, 11 Dec 2018 21:24:57 +0000 (14:24 -0700)]
Merge tag 'v4.20-rc6' into rdma.git for-next

For dependencies in following patches.

5 years agoIB/mlx5: Add HDR speed support to query port
Michael Guralnik [Sun, 9 Dec 2018 09:49:52 +0000 (11:49 +0200)]
IB/mlx5: Add HDR speed support to query port

Report HDR speed when HDR is supported in CapabilityMask2 and the actual
speed is HDR.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/mlx5: Report CapabilityMask2 in ib_query_port
Michael Guralnik [Sun, 9 Dec 2018 09:49:51 +0000 (11:49 +0200)]
IB/mlx5: Report CapabilityMask2 in ib_query_port

CapabilityMask2 exists when IB_PORT_CAP_MASK2_SUP is set in the original
capability mask. In such cases, query its value and report it in query
port.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/core: Add new IB rates
Michael Guralnik [Sun, 9 Dec 2018 09:49:50 +0000 (11:49 +0200)]
IB/core: Add new IB rates

Add the new rates that were added to Infiniband spec as part of HDR and 2x
support.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/core: Add 2X port width
Michael Guralnik [Sun, 9 Dec 2018 09:49:49 +0000 (11:49 +0200)]
IB/core: Add 2X port width

Add the new 2X port width that is part of IB spec 1.3

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/core: Add CapabilityMask2 to port attributes
Michael Guralnik [Sun, 9 Dec 2018 09:49:48 +0000 (11:49 +0200)]
IB/core: Add CapabilityMask2 to port attributes

CapabilityMask2 was added in IB Spec 1.3 under PortInfo attribute.  The
new Capapbility mask is needed in order to expose the new 2X width and HDR
speed.

Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Reviewed-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoIB/rxe: Fix incorrect cache cleanup in error flow
Yuval Shaia [Sun, 9 Dec 2018 13:53:49 +0000 (15:53 +0200)]
IB/rxe: Fix incorrect cache cleanup in error flow

Array iterator stays at the same slot, fix it.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: Bugfix for RoCE loopback test
Lijun Ou [Sat, 8 Dec 2018 10:40:11 +0000 (18:40 +0800)]
RDMA/hns: Bugfix for RoCE loopback test

This patch implements a cmdq to enable the loopback of ssu module
according to the modified hardware desgin.

The ssu consists of ingress unit, packet buffer and programmable packet
process unit. if the loopback bit of ssu is not enabled, the roce packet
with loopback bit will fail.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: Update posting & querying mailbox
Lijun Ou [Sat, 8 Dec 2018 10:40:10 +0000 (18:40 +0800)]
RDMA/hns: Update posting & querying mailbox

This patch updates the implementation of the mailbox command interface by
using command queue instead of operating registers.  With this update, the
software can be well decoupled with the hardware.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: Fix the bug while use multi-hop of pbl
Lijun Ou [Sat, 8 Dec 2018 10:40:09 +0000 (18:40 +0800)]
RDMA/hns: Fix the bug while use multi-hop of pbl

It will prevent multiply overflow when defines the pbl for u64 type.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: Encapsulate and simplify qp state transition
Lijun Ou [Sat, 8 Dec 2018 10:40:08 +0000 (18:40 +0800)]
RDMA/hns: Encapsulate and simplify qp state transition

This patch move the codes of qp state transition into the new function as
well as simplify the logic for other qp states transition.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agoRDMA/hns: Init qp context when modify qp from reset to init
Lijun Ou [Sat, 8 Dec 2018 10:40:07 +0000 (18:40 +0800)]
RDMA/hns: Init qp context when modify qp from reset to init

It needs to clear qp context previous when init qp context.  Otherwise,
the newly created qp context residue has the contents of the qp context
before the uninstall, and the qp context content is disordered, causing
the task to fail.

Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
5 years agonet/mlx5: Remove the get protocol device interface entry
Or Gerlitz [Mon, 10 Dec 2018 21:15:17 +0000 (13:15 -0800)]
net/mlx5: Remove the get protocol device interface entry

This isn't used anywhere across the mlx5 driver stack,
remove it.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Support extended destination format in flow steering command
Eli Britstein [Mon, 10 Dec 2018 21:15:16 +0000 (13:15 -0800)]
net/mlx5: Support extended destination format in flow steering command

Update the flow steering command formatting according to the extended
destination API.
Note that the FW dictates that multi destination FTEs that involve at
least one encap must use the extended destination format, while single
destination ones must use the legacy format.
Using extended destination format requires FW support. Check for its
capabilities and return error if not supported.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: E-Switch, Change vhca id valid bool field to bit flag
Eli Britstein [Mon, 10 Dec 2018 21:15:15 +0000 (13:15 -0800)]
net/mlx5: E-Switch, Change vhca id valid bool field to bit flag

Change the driver flow destination struct to use bit flags with the vhca
id valid being the 1st one. The flags field is more extendable and will
be used in downstream patch.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Introduce extended destination fields
Eli Britstein [Mon, 10 Dec 2018 21:15:14 +0000 (13:15 -0800)]
net/mlx5: Introduce extended destination fields

Extended destinations provide the ability to configure different
encapsulation properties per destination on a single FTE. This is
needed for use-cases such as remote mirroring over tunneled networks.

Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Revise gre and nvgre key formats
Oz Shlomo [Mon, 10 Dec 2018 21:15:13 +0000 (13:15 -0800)]
net/mlx5: Revise gre and nvgre key formats

GRE RFC defines a 32 bit key field. NVGRE RFC splits the 32 bit
key field to 24 bit VSID (gre_key_h) and 8 bit flow entropy (gre_key_l).

Define the two key parsing alternatives in a union, thus enabling both
access methods.

Signed-off-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Eli Britstein <elibr@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Add monitor commands layout and event data
Eyal Davidovich [Mon, 10 Dec 2018 21:15:12 +0000 (13:15 -0800)]
net/mlx5: Add monitor commands layout and event data

Will be used in downstream patch to monitor counter changes
by the HCA and report it to the driver by an event.
The driver will update its counters cached data accordingly.

Signed-off-by: Eyal Davidovich <eyald@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Add support for plugged-disabled cable status in PME
Mikhael Goikhman [Mon, 10 Dec 2018 21:15:11 +0000 (13:15 -0800)]
net/mlx5: Add support for plugged-disabled cable status in PME

Support a new hardware module status in port module events:
- module_status=0x4 (Cable plugged, but disabled)

Signed-off-by: Mikhael Goikhman <migo@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Add support for PCIe power slot exceeded error in PME
Mikhael Goikhman [Mon, 10 Dec 2018 21:15:10 +0000 (13:15 -0800)]
net/mlx5: Add support for PCIe power slot exceeded error in PME

Support a new hardware error type in port module events:
- error_type=0xc (PCIe system power slot exceeded)

Signed-off-by: Mikhael Goikhman <migo@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Rework handling of port module events
Mikhael Goikhman [Mon, 10 Dec 2018 21:15:09 +0000 (13:15 -0800)]
net/mlx5: Rework handling of port module events

Add explicit HW defined error values. For simplicity, keep counters for all
statuses starting from 0, although currently status=0 is not used.

Additionally, when HW signals an unexpected cable status, it is reported
now rather than ignored. And status counter is now updated on errors.

Signed-off-by: Mikhael Goikhman <migo@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5: Move flow counters data structures from flow steering header
Saeed Mahameed [Wed, 5 Dec 2018 02:03:03 +0000 (18:03 -0800)]
net/mlx5: Move flow counters data structures from flow steering header

After the following flow counters API refactoring:
("net/mlx5: Use flow counter IDs and not the wrapping cache object")
flow counters private data structures mlx5_fc_cache and mlx5_fc are
redundantly exposed in fs_core.h, they have nothing to do with flow
steering core and they are private to fs_counter.c, this patch moves them
to where they belong and reduces their exposure in the driver.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agoIB/mlx5: Use helper to get CQE opcode
Tariq Toukan [Wed, 5 Dec 2018 02:03:02 +0000 (18:03 -0800)]
IB/mlx5: Use helper to get CQE opcode

Use the new helper that extracts the opcode
from a CQE (completion queue entry) structure.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>