Message ID | 20191025010351.30298-8-joel@jms.id.au |
---|---|
State | New |
Headers | show |
Series | AST2600 FSI speed improvements | expand |
Hi Joel, > The driver can skip doing some of the AHB2OPB setup if the operation is > of the same type. Experiment with this for full word reads, which could > be extended to writes if it shows an improvement. I would have taken a slightly different approach here - to do the "last value" caching at the OPB register set level, rather than working at the FSI-operation level. ie, keep a copy of the AHB registers that control OPB transactions, then only set them when they differ from the intended writes. That would give us something like this for opb_read(): if (aspeed->opb0_select != 0x1) { aspeed->opb0_select = 0x1; writel(aspeed->opb0_select, base + OPB0_SELECT); } if (aspeed->opb0_rw != CMD_READ) { aspeed->opb0_rw = CMD_READ; writel(aspeed->opb0_rw, base + OPB0_RW); } if (aspeed->opb0_xfer_size != XFER_WORD) { aspeed->opb0_xfer_size = XFER_WORD; writel(aspeed->opb0_xfer_size, base + OPB0_XFER_SIZE); } if (aspeed->opb0_addr != addr) { aspeed->opb0_addr = addr; writel(aspeed->opb0_addr, base + OPB0_FSI_ADDR } writel(0x1, base + OPB_IRQ_CLEAR); writel(0x1, base + OPB_TRIGGER); [which we could simplify with a helper function for the caching...] However, if the aim is to just stage this to just the fullword reads for now, this looks fine for me. Cheers, Jeremy
On Tue, 29 Oct 2019 at 03:17, Jeremy Kerr <jk@ozlabs.org> wrote: > > Hi Joel, > > > The driver can skip doing some of the AHB2OPB setup if the operation is > > of the same type. Experiment with this for full word reads, which could > > be extended to writes if it shows an improvement. > > I would have taken a slightly different approach here - to do the "last > value" caching at the OPB register set level, rather than working at the > FSI-operation level. ie, keep a copy of the AHB registers that control > OPB transactions, then only set them when they differ from the intended > writes. > > That would give us something like this for opb_read(): > > if (aspeed->opb0_select != 0x1) { > aspeed->opb0_select = 0x1; > writel(aspeed->opb0_select, base + OPB0_SELECT); > } > > if (aspeed->opb0_rw != CMD_READ) { > aspeed->opb0_rw = CMD_READ; > writel(aspeed->opb0_rw, base + OPB0_RW); > } > > if (aspeed->opb0_xfer_size != XFER_WORD) { > aspeed->opb0_xfer_size = XFER_WORD; > writel(aspeed->opb0_xfer_size, base + OPB0_XFER_SIZE); > } > > if (aspeed->opb0_addr != addr) { > aspeed->opb0_addr = addr; > writel(aspeed->opb0_addr, base + OPB0_FSI_ADDR > } > > writel(0x1, base + OPB_IRQ_CLEAR); > writel(0x1, base + OPB_TRIGGER); > > [which we could simplify with a helper function for the caching...] I am not convinced that this would be faster. In the best case (same operation) it's N branches. In the worst case, it's N branches and N mmio operations. I don't have any data for how often the values change, so that's something I could collect. I'll defer merging this change until we have more data. Cheers, Joel > > However, if the aim is to just stage this to just the fullword reads for > now, this looks fine for me. > > Cheers, > > > Jeremy >
diff --git a/drivers/fsi/fsi-master-aspeed.c b/drivers/fsi/fsi-master-aspeed.c index 7e515b43b7a6..30e818728402 100644 --- a/drivers/fsi/fsi-master-aspeed.c +++ b/drivers/fsi/fsi-master-aspeed.c @@ -92,6 +92,8 @@ struct fsi_master_aspeed { void __iomem *base; struct clk *clk; + bool last_was_fw_read; + struct dentry *debugfs_dir; struct fsi_master_aspeed_debugfs_entry debugfs[FSI_NUM_DEBUGFS_ENTRIES]; }; @@ -205,6 +207,8 @@ static u32 opb_write(struct fsi_master_aspeed *aspeed, uint32_t addr, writel(0x1, base + OPB_IRQ_CLEAR); writel(0x1, base + OPB_TRIGGER); + aspeed->last_was_fw_read = false; + ret = readl_poll_timeout(base + OPB_IRQ_STATUS, reg, (reg & OPB0_XFER_ACK_EN) != 0, 0, 10000); @@ -224,6 +228,43 @@ static u32 opb_write(struct fsi_master_aspeed *aspeed, uint32_t addr, return 0; } +static int opb_read_repeat_fullword(struct fsi_master_aspeed *aspeed, + uint32_t addr, u32 *out) +{ + void __iomem *base = aspeed->base; + u32 result, reg; + int status, ret; + + writel(addr, base + OPB0_FSI_ADDR); + writel(0x1, base + OPB_IRQ_CLEAR); + writel(0x1, base + OPB_TRIGGER); + + ret = readl_poll_timeout(base + OPB_IRQ_STATUS, reg, + (reg & OPB0_XFER_ACK_EN) != 0, + 0, 10000); + + status = readl(base + OPB0_STATUS); + + result = readl(base + OPB0_FSI_DATA_R); + + trace_fsi_master_aspeed_opb_read(addr, 4, result, + readl(base + OPB0_STATUS), + reg); + + /* Return error when poll timed out */ + if (ret) + return ret; + + /* Command failed, master will reset */ + if (status & STATUS_ERR_ACK) + return -EIO; + + if (out) + memcpy(out, &result, 4); + + return 0; +} + static int opb_read(struct fsi_master_aspeed *aspeed, uint32_t addr, size_t size, u32 *out) { @@ -241,6 +282,9 @@ static int opb_read(struct fsi_master_aspeed *aspeed, uint32_t addr, writel(0x1, base + OPB_IRQ_CLEAR); writel(0x1, base + OPB_TRIGGER); + if (size == 4) + aspeed->last_was_fw_read = true; + ret = readl_poll_timeout(base + OPB_IRQ_STATUS, reg, (reg & OPB0_XFER_ACK_EN) != 0, 0, 10000); @@ -311,7 +355,11 @@ static int aspeed_master_read(struct fsi_master *master, int link, return -EINVAL; addr += link * FSI_HUB_LINK_SIZE; - ret = opb_read(aspeed, fsi_base + addr, size, val); + + if (aspeed->last_was_fw_read) + ret = opb_read_repeat_fullword(aspeed, fsi_base + addr, val); + else + ret = opb_read(aspeed, fsi_base + addr, size, val); ret = check_errors(aspeed, ret); if (ret)
The driver can skip doing some of the AHB2OPB setup if the operation is of the same type. Experiment with this for full word reads, which could be extended to writes if it shows an improvement. Signed-off-by: Joel Stanley <joel@jms.id.au> --- drivers/fsi/fsi-master-aspeed.c | 50 ++++++++++++++++++++++++++++++++- 1 file changed, 49 insertions(+), 1 deletion(-)